Abstract

A detailed proof of hard scattering factorization is given with the inclusion of
heavy quark masses. Although the proof is explicitly given for deep-inelastic
scattering, the methods apply more generally. The power-suppressed corrections
to the factorization formula are uniformly suppressed by a power of
$\Lambda/Q$, independently of the size of heavy quark masses, M, relative to Q.

I. INTRODUCTION



A correct treatment of heavy quarks in higher-order perturbative QCD calculations is
important |T|-|TT| to precision phenomenology. Among the reasons is the fact that a
substantial fraction of the deep-inelastic cross section at HERA is in heavy quark production.
Moreover, this occurs in a region where the heavy quark masses are not necessarily negligible
with respect to the large momentum scales in the problem (like Q).

However, there is considerable confusion P,|6HTll about what constitute correct
methods for treating heavy quarks. Some of the difficulties occur because many treatments
assume that quarks either are so light that their masses are negligible with respect to Q or
have masses that are of order Q, where Q denotes a typical scale for the hard-scattering
process under discussion. One has to be able to handle the intermediate region, where Q is
somewhat larger than a quark mass but not enormously much larger.

Even when Q is much larger than all quark masses, the intermediate region must still 
be treated, because evolution equations are used to obtain the strong coupling, the parton 
densities, and the fragmentation functions from starting values specified at scales of a few 
GeV. The symptoms of this issue are the different and apparently incompatible 'matching 
conditions' that have been proposed^. 

In this paper I will give a relatively simple and general proof of factorization including 
the effects of heavy quarks. The only issue that will not be treated is the cancellation of 
soft gluons, an issue which is essentially orthogonal to the ones which are causing problems. 



^ See for example

The key ingredient is the observation that the short-distance coefficient functions ('Wilson
coefficients') can legitimately be calculated with the quark masses left non-zero. Previous 
work with Aivazis, Olness and Tung |^ and others ^ has used this method; what is new is 
the complete and detailed all-orders proof. 

This first main characteristic of the method, that quark masses are retained when
necessary in the calculations of the coefficient functions, enables factorization to be valid when the
masses of quarks are non-negligible with respect to the large scale Q of the hard scattering.
Hence the method avoids the normal problem when the MS scheme is used with massless
Wilson coefficients, that there are uncontrolled corrections of order a power of M/Q, where
M is a heavy quark mass.

The second main characteristic is that the renormalization and factorization scheme
consists of a series of subschemes labeled by the number of 'active quark flavors', $n_a$. This
is simply a generalization of the CWZ scheme that is in standard use [13] for the QCD
coupling $\alpha_s$. When discussing the numerical values of parton densities, it is necessary to
specify the number of active flavors that is used in their definition, just as in the case of the
coupling.

The subschemes with different numbers of active flavors are useful in different ranges of 
physical scales, but with overlapping ranges of validity. Since the subschemes are related
by definite matching conditions [T^,|T5[, the choice of the number of active flavors does not
result in any more indefiniteness in the physical predictions than does the freedom to choose 
a scheme or a value of the renormalization/factorization scale. 

At first sight, the use of a sequence of subschemes instead of a single scheme appears 
rather baroque. However, it is in fact the simplest implementation of mass-dependent
factorization [|T6|. We require that the schemes implement decoupling of heavy quarks when
appropriate, and that they implement the closest possible scheme to the mass-independent 
MS scheme, which is commonly used for most perturbative QCD calculations. If one did 
not have a sequence of schemes, it would be necessary to have mass-dependent evolution 
equations. The CWZ scheme does have mass-dependent evolution in the following sense. If 
one chooses particular "thresholds" — more accurately called "switching points" — to change 
the number of active flavors, then the evolution kernels change at the thresholds. Moreover, 
the matching conditions at the thresholds can be thought of as corresponding delta-function 
contributions to the kernels. 

Some of the confusion in the literature can be traced to the supposition that Wilson 
coefficients must be calculated with massless quarks. Indeed, many papers, for example 
P,[18|,|19[, treat factorization as a question of factoring out mass divergences in a massless 
theory. Such methods founder when the quarks have non-negligible masses, since then some 
of the divergences are not literally present. It should be noted that the proof of factorization 
in 1^ does not assume that quarks are massless (contrary to the assertion in the proof 



merely assumes that one is treating a limit in which the scale of the hard scattering is much 
larger than all masses. 

Another source of problems is that many treatments of factorization |^,|T^,|TP| take as 
their starting point an assertion that hard cross sections are the convolution of 'bare parton 
densities' with unsubtracted 'partonic cross sections'. Although this assertion is widespread, 
it has no proof: it has the status of an unproved conjecture. Indeed it is not obvious that 
it is even true. However, this bare parton conjecture is not necessary either to the proof of
the factorization theorem or to its use.

These problems with existing treatments, even without the treatment of heavy quarks, 
provide motivation for providing much detail in the proofs in this paper. The proofs apply 
equally well in the absence of heavy quarks. 

The treatment in this paper will be based on the basic power counting theorems derived
by Libby and Sterman [^] and on the methods of Curci, Furmanski and Petronzio
for organizing sums of generalized ladder graphs. The treatment of heavy quarks uses the 
methods of Collins, Wilczek and Zee (CWZ) [1^. The powerful methods developed by 
Chetyrkin, Tkachov, and Gorishnii |T^,^,^ for the operator product expansion with mass 
effects are consistent with the CWZ scheme. 

The outline of this paper is as follows: In Sec. II, I explain the requirements that I consider
it necessary to impose on a good treatment of mass effects. Then, in Sec. III, I review the
CWZ scheme for renormalization. In that section, I also define a consistent terminology of
'light' and 'heavy' quarks, and of 'partonic' (or 'active') and 'non-partonic' quarks. In Secs.
IV to p^, I prove factorization in the case that there is one heavy quark and that Q is at
least as large as the heavy quark mass; this is the case where the heavy quark is active.
As an interlude in the formal proof, in Sec. V, I provide a mathematical example of the
asymptotics of certain integrals that mimic the behavior of the more complicated integrals in
Feynman graphs for QCD. Then in Sec. I prove factorization for the case that the heavy
quark may be treated as inactive. ('Non-partonic' is a better term.) The general case, that
there are several heavy quarks of various masses, forms a relatively simple generalization of
the preceding work, and is treated in Sec. 0. An account of the matching conditions and of
the evolution equations is given in Sec. XII. This is followed by an account of the relation
of the present scheme to the schemes of other authors, in Sec. XIII. The conclusions are
in Sec. XIV. In App. 0, I explain a certain mathematical complication that appears in the
middle of the proof.



II. REQUIREMENTS FOR A GOOD FACTORIZATION SCHEME 

The overall aim of work such as ours is to represent interesting cross sections (or 
other quantities) in terms of perturbatively calculable quantities and a limited set of non- 
perturbative quantities that must at present be obtained from experiment. A typical result is 
that for deep-inelastic structure functions and other hard-scattering cross sections we have 
factorization theorems: the leading large Q behavior is a convolution of hard-scattering 
coefficients, which can be perturbatively calculated, and of parton densities and/or frag- 
mentation functions. There are also evolution equations for the parton densities, etc., for 
which the evolution kernels are perturbatively calculable. 

Although the factorization theorems are true in a general quantum field theory, and 
not just in QCD, their particular utility in QCD is caused by the asymptotic freedom 
of QCD. Without the use of factorization, perturbative calculations of typical scattering 
amplitudes and cross sections involve integrals down to low virtualities where the effective 
coupling is too large for low-order perturbation theory to be valid. Factorization theorems 
segregate the non-perturbative part of a cross section into a limited number of experimentally 
measurable parton densities, etc. Moreover, typical cross sections depend on several scales 
and perturbative calculations typically have one or two logarithms of ratios of scales for each
loop. Since the QCD coupling is not very small, the logarithms can ruin the accuracy of
practical calculations. By working with quantities that each depend on a single scale, one 
avoids this loss of accuracy. 

For the purposes of this section, we will let Q be a (large) scale defining the kinematics 
of the hard-scattering process under discussion and we will let M denote the mass of some 
heavy quark. A satisfactory treatment should satisfy the following requirements: 

• The formalism should apply to all orders of perturbation theory and include arbitrarily 
non-leading logarithms. 

• Explicit definitions must be given of the non-perturbative quantities, as matrix ele- 
ments of operators. 

• The formalism is to be applicable to all the cases $Q \gg M$, $Q \sim M$ and $Q \ll M$, and
the errors are suppressed by a power of $\Lambda/Q$.

• Multiple heavy quarks should be treated without loss of accuracy no matter whether 
the ratios of the masses are large or not. 

The results in this paper will also satisfy some other requirements which are more matters 
of convenience than absolute principles: 

• When a quark mass is large enough for decoupling to apply, calculations should exhibit 
manifest decoupling. That is, they should reduce to calculations in a standard scheme 
(e.g., MS) in the theory with the heavy quarks omitted, and with no need to adjust 
the numerical values of the coupling. 

• The scheme should reduce to a standard scheme (e.g., MS) when the masses are much 
less than Q. We will in fact use the MS scheme, so that standard hard-scattering 
calculations can be used unchanged in the case that masses can be neglected. 

• The previous two requirements apply to both factorization and to the coupling $\alpha_s$.

• The evolution equations for the parton densities, etc., should be homogeneous. That
is, they should be of the form of conventional DGLAP equations or renormalization
group equations rather than of the form of Callan-Symanzik equations [24]. (The
solutions of Callan-Symanzik equations need an extra level of approximation to make
them useful for calculations.)



III. CWZ SCHEME 

The short-distance coefficient functions are almost completely determined once one has 
specified a scheme for defining the parton densities — in fact a scheme for renormalizing the 
ultra-violet divergences in the coupling and in the parton densities. The scheme defined in 
this paper is in fact a composite of a series of related schemes in the fashion proposed by 
Collins, Wilczek and Zee (CWZ).



First, it is necessary to introduce some terminology whose consistent use will aid our
work. Let us define a 'light' quark or gluon to be one whose mass is of the order of $\Lambda$ or
less, i.e., under about a GeV. Similarly, let us define a 'heavy' quark to be one whose mass
is larger than a GeV or so, so that the effective coupling, $\alpha_s(M)$, at the scale of a heavy
quark mass is in the perturbative region. With this definition, the charm, bottom and top
quarks are the heavy quarks. We let $n_l$ be the number of light quarks, and $n_f$ be the total
number of quarks. In our present state of knowledge of QCD we have $n_l = 3$ and $n_f = 6$.

Each subscheme of the CWZ scheme is labeled by a number $n_a$, which I will call the
number of 'active' (or 'partonic') quarks. These are the $n_a$ lightest quarks. All the remaining
quarks I call 'non-partonic'. (It is also possible to call them 'inactive', but the term can be
misleading.) In each subscheme:

• Graphs that contain only active parton lines (i.e., gluons and active quarks) are renor- 
malized by MS counterterms, with the exception of the renormalization of the masses 
of heavy quarks. 

• Graphs all of whose external lines are active partons but which have internal non- 
partonic quark lines are renormalized by zero-momentum subtraction. 

• Heavy quark masses are defined as pole masses, as in the work of Smith, van Neerven
and collaborators. (We could also choose to define heavy quark masses as
MS masses without changing the formalism.)

• Other graphs with external non-partonic lines are renormalized by MS counterterms. 

These definitions are applied to the renormalization of the interaction and to the renormal- 
ization of the parton densities, fragmentation functions, etc. 

A consequence of the definitions is that we will talk about 'three-flavor', 'four-flavor',
etc., definitions of the coupling and parton densities (and fragmentation functions). Use of 
such a sequence of definitions is already common practice for the coupling [|T^], and identical 
considerations apply to the parton densities. As a consequence it is meaningful to specify 
numerical values of the coupling and of parton densities only if the number of active flavors 
is specified. There are perturbatively calculable relations, or matching conditions, between 
the values of these quantities with different numbers of active flavors. 

I will now list properties of this set of schemes that are important for our purposes. Their 
proofs are either in Ref. or are later in this paper. 



• The scheme coincides with ordinary MS when all partons are active^, i.e., $n_a = n_f$.

^ Except that we have chosen to define heavy quark masses as pole masses.

• Manifest decoupling is obeyed. If we have a process in which all external momentum
scales are much less than the masses of the non-partonic quarks, then we can omit all
graphs containing non-partonic quarks and only make a power-suppressed error. In
contrast, in a scheme that does not have manifest decoupling, we would have to adjust
the numerical values of the couplings and of the parton densities.

• Evolution equations for the densities of active partons and of the coupling $\alpha_s$ are exactly
those of a pure MS scheme in a theory with $n_a$ quark flavors. This is a consequence
of the mass-independence of UV counterterms in the MS scheme, together with an
application of the decoupling theorem.

• The relation between the subschemes is just a particular case of the relation between
different renormalization schemes. The matching conditions between the schemes with
different numbers of active quarks are known to three loops for the coupling [14] and to
two loops for the parton densities [|T3|. The matching condition between quantities in
the subschemes with $N$ and $N + 1$ active flavors involves no large logarithms of
masses, provided that the renormalization/factorization scale $\mu$ is of order the mass
of the $(N + 1)$th quark. (For example, we would choose $\mu$ to be of order the
mass of the charm quark when we compute the relation between the three- and
four-flavor schemes.)

• In general, if one varies the physical scale Q of some process (e.g., deep-inelastic 
scattering), one should vary the number of active quarks suitably. Quarks of mass 
much less than Q are to be active, while quarks of mass much larger than Q should 
be non-partonic. One has a choice for those flavors whose masses are close to Q, and 
I suspect a bias in favor of keeping quarks non-partonic will lead to more accurate 
calculations. 

• The light partons are always to be treated as active. 
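
As a rough numerical illustration of the subscheme structure listed above, the following sketch evolves the coupling with a number of active flavors that changes at switching points. It is only a sketch under stated assumptions: one-loop running, switching and matching points both set equal to the quark masses, the lowest-order matching condition (the coupling is continuous across the switch), and illustrative mass and coupling values that are not taken from this paper. All function names are hypothetical.

```python
import math

# Illustrative heavy-quark masses in GeV, in ascending order
# (assumed values for charm, bottom, top; not taken from the paper).
HEAVY_MASSES = [1.5, 4.5, 175.0]
N_LIGHT = 3  # the light quarks are always active


def n_active(mu):
    """Number of active flavors when switching points equal the quark masses."""
    return N_LIGHT + sum(1 for m in HEAVY_MASSES if mu > m)


def beta0(nf):
    """One-loop beta-function coefficient for nf active flavors."""
    return 11.0 - 2.0 * nf / 3.0


def run_fixed_nf(alpha0, mu0, mu, nf):
    """One-loop evolution from mu0 to mu with a fixed number of flavors."""
    log = math.log(mu * mu / (mu0 * mu0))
    return alpha0 / (1.0 + alpha0 * beta0(nf) / (4.0 * math.pi) * log)


def alpha_s(mu, alpha0=0.35, mu0=1.3):
    """Evolve upward from (mu0, alpha0), adding a flavor at each switching point.

    Lowest-order matching is assumed: the coupling is continuous at mu = m.
    The starting values alpha0 and mu0 are illustrative assumptions.
    """
    if mu < mu0:
        raise ValueError("this sketch only evolves upward from mu0")
    alpha, scale, nf = alpha0, mu0, n_active(mu0)
    for m in HEAVY_MASSES:
        if scale < m < mu:
            alpha = run_fixed_nf(alpha, scale, m, nf)  # run up to the switch
            scale, nf = m, nf + 1                      # one more active flavor
    return run_fixed_nf(alpha, scale, mu, nf)
```

With these choices the coupling is continuous at each quark mass, while its logarithmic slope changes there because the one-loop coefficient depends on the number of active flavors. Higher-order matching conditions would introduce small discontinuities that this sketch does not model.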

It might be considered odd that in a region where Q is of the order of the mass of some 
heavy quark we have a choice as to whether to treat the quark as active or not. The 
freedom is entirely comparable to the freedom to choose the precise value of the renormal- 
ization/factorization scale. Indeed the existence of a region where the two subschemes have 
comparable accuracy is vital to the success of a good treatment of heavy quarks, because
it enables reliable perturbative calculations to be made of the matching conditions [0,[T5
between the two subschemes.

Commonly the scheme is implemented by choosing what can be called "matching"
and "switching" points to be equal to the relevant heavy quark mass. For example, in
treating DIS with a charm quark, one often sets the renormalization/factorization scale $\mu$
to the kinematic variable Q. Then one uses a 3-flavor subscheme if $\mu < m_c$ and a 4-flavor
subscheme if $\mu > m_c$. One also chooses to evaluate the matching conditions between the
subschemes at $\mu = m_c$. None of these choices is essential, and any change gives a change in
the physical predictions only because of the errors due to the truncation of the perturbation
series. It is probably only appropriate (i.e., suitable for fixed order perturbative calculations)
to use a 4-flavor subscheme if one is treating a situation where the cross section is above
the physical threshold for charm production, which is at $Q = 2 m_c \sqrt{x/(1-x)}$. Hence, if
x is rather large, then it would be appropriate to use the 3-flavor scheme even when Q is
substantially above $m_c$.

Note that there are three distinct mass scales referred to in the previous paragraph: a 
matching point, a switching point and a physical threshold. 
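
The difference between the switching point and the physical threshold can be made concrete with a small sketch. It assumes that the target mass is neglected, so that the hadronic invariant mass squared is W^2 = Q^2 (1 - x)/x, and that charm pair production requires W >= 2 m_c, which gives the threshold quoted above. The charm mass value and the function names are illustrative assumptions, and the rule for picking the subscheme is only one possible choice, not a prescription of this paper.

```python
import math

M_CHARM = 1.5  # GeV, illustrative value (assumption, not from the paper)


def charm_threshold_q(x, m_c=M_CHARM):
    """Q at the physical threshold W = 2 m_c, using W^2 = Q^2 (1 - x) / x."""
    if not 0.0 < x < 1.0:
        raise ValueError("Bjorken x must lie strictly between 0 and 1")
    return 2.0 * m_c * math.sqrt(x / (1.0 - x))


def suggested_flavors(q, x, m_c=M_CHARM):
    """Pick the 3- or 4-flavor subscheme for DIS with a charm quark.

    The 4-flavor subscheme is chosen only if Q is above both the switching
    point (taken equal to m_c here) and the physical charm threshold.
    """
    above_switch = q > m_c
    above_threshold = q > charm_threshold_q(x, m_c)
    return 4 if (above_switch and above_threshold) else 3
```

For instance, at x = 0.8 the threshold lies at Q = 4 m_c, so this rule keeps the 3-flavor subscheme for Q well above the charm mass, while at small x the threshold is far below m_c and the switching point alone decides.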

Of course, one is free to disregard the CWZ scheme and use some other scheme, provided 
that it provides complete definitions of the parton densities and of the coupling. However,
this does not affect the validity of the CWZ definitions. The significance of the CWZ
definitions is that when all flavors are active, they are exactly the MS ones. 



IV. BASICS OF FACTORIZATION WHEN $Q \gtrsim M$

The principles of the proof can be best explained by first considering the case that there 
is exactly one heavy quark, of mass M. There will be in effect two factorization theorems 
to prove. The first, whose treatment starts in this section, is appropriate when the physical 
scale Q of the hard scattering is at least as large in magnitude as M. In this case, it is
appropriate to treat the heavy quark as active: the factorization theorem will include a 
term with a heavy quark density. 

The second case, whose treatment starts in Sec. ^ is appropriate when Q < M, and it 
treats the heavy quark as non-partonic. Then the factorization theorem has no term with 
a heavy quark density, and all heavy-quark production is to be found (at leading power) in 
the coefficient function. 

As mentioned earlier, there is an overlap region, Q ~ M, where both theorems are appro- 
priate, i.e., they give comparable accuracy in predictions based on finite-order calculations 
of coefficient functions. 

So in this section, we start the treatment of a factorization theorem for deep-inelastic
structure functions, given the assumption that $Q \gtrsim M$. A single factorization formula will
cover the case that Q is much bigger than the heavy quark mass, as well as the case that
Q and the heavy quark mass are comparable, and the intermediate region. Our notation
for the photon momentum q, the hadron momentum p, and for the Bjorken variable x is
standard. As usual $Q^2 = -q^2 > 0$. We will assume that quark masses are at most of order
Q.

When reading through the proof, it may be worth the reader's while to refer ahead to Sec.
V. There, a simple mathematical example is given of the kinds of integral under discussion,
and it is possible more easily to see the meaning of the formal manipulations in the proof.



A. Leading regions 

In the Bjorken limit (large Q, fixed x), the leading power behavior is given by the regions
symbolized in Fig. 1, as was proved by Libby and Sterman [^1],|26|. In each region, there is
what we call a hard subgraph H, all of whose lines are effectively off-shell by order $Q^2$. It
is to this subgraph that the virtual photons couple. The rest of the graph has lines that are
much lower in virtuality and that are approximately collinear to the momentum p of the
target. The latter part of the graph we will call the target subgraph T. We will give more
quantitative characterizations of the regions later. (For example, we must deal with the fact
that there is a final-state cut, so that some lines in H are actually on-shell instead of having
virtuality $Q^2$.)

Although one often does purely perturbative calculations in which the target is a quark 
or gluon state, our treatment will also apply to hadron targets. In that case, suitable 
bound-state wave-functions will be incorporated in T.

FIG. 1. Regions for the leading power of structure functions have this structure.

A result of the power counting is that for a contribution to have the leading power — 
to be of 'leading twist' — the two subgraphs H and T must be connected to each other by 
two parton lines, one on each side of the final-state cut. The set of decompositions into two 
such subgraphs H and T is in one-to-one correspondence with the set of leading regions. 
There are two exceptions to this correspondence. The main exception is that if the heavy 
quark mass is of order Q, then the H and T subgraphs are connected by light parton lines, 
but not by heavy quark lines. This exception arises because the definition of the region 
implies that the lines joining H and T have virtuality much less than Q, and this is not 
possible if the lines are heavy quark lines of a mass comparable to Q. The second exception 
to the power-counting rules is that gluons with scalar polarization can couple the H and T 
subgraphs without a power-law penalty, at least in a covariant gauge: we will discuss this 
issue in more detail later in the section. 

We define the subgraph T to include the full propagators of the lines joining it to H, 
since these lines have momenta collinear to the target. Hence the hard subgraph H is 
one-particle-irreducible (1PI) in these same lines.

In this and later figures, we have the initial state at the bottom of the graph, and the 
hard subgraph to the left. This ensures that the orientation of the figures corresponds to 
the equations we will write for convolutions of amplitudes. For example, we can write Fig.
1 as $H \cdot T$.

Any region of loop-momentum space that cannot be characterized by Fig. 1 is suppressed
by a power of Q. Therefore the statement that the leading regions have the form of Fig.
1 is true to all orders in the coupling and includes not just the leading logarithms but all
non-leading logarithms as well.

A typical graph can have many different decompositions into hard and target subgraphs.
For example, Fig. 2 has four such decompositions,^ and hence four leading regions. The
possibility of having more than one leading region is characteristic not only of QCD, but of
any renormalizable field theory, since adding extra lines inside H in a theory with a
dimensionless coupling does not change the counting of powers of Q. It is the large multiplicity of
regions that results in many of the complications in the proof of factorization. In addition,
it results in the logarithmic dependence on Q that is typical of higher order calculations in
QCD.

^ The one decomposition that may not be obvious is where H comprises the whole of the graph in
Fig. 2 with the exception of the right-most two external lines. T is then a trivial graph, in essence
a factor of unity.

FIG. 2. A graph with 3 decompositions of the form of Fig. 1.

FIG. 3. The handbag diagram that characterizes the only leading region in a
super-renormalizable theory.

In contrast, super-renormalizable theories (e.g., QCD in less than four space-time
dimensions) have couplings with positive mass dimension. This implies that there is a single
leading region. It is of the form of Fig. 1, but with the smallest possible graph for H.
That is, the unique^ leading region has the form of the handbag diagram, Fig. 3. Although
super-renormalizable theories do not represent real strong-interaction physics, experience in
treating simple cases is useful in formulating the factorization theorem. Factorization, etc.,
for super-renormalizable theories is equivalent to the set of results obtained many years ago
by Landshoff and Polkinghorne in the context of their covariant parton model [P7|].

Let us now list some technical complications that we will be able to ignore, but that are
treated in other papers |l20|,^[28[] on factorization:



• Although we have defined the target part T to consist only of lines with collinear
momenta, it may in fact contain some highly virtual lines. These are confined to
subgraphs that are ultra-violet divergent and just generate the usual UV divergences
that are cancelled by counterterms in the Lagrangian. This complication does not
affect our proofs, since none of the divergent subgraphs in QCD overlap between T
and H, and our proofs will treat T as a black box.

• Although we treat the hard subgraph as being composed of lines all of which have
large virtuality, this subgraph necessarily includes at least one final-state line. But
after a sum over the possible final-state cuts, the hard subgraph is a discontinuity of
a certain Green function. Then the whole graph can be represented as a contour
integral over a Green function in which all the lines in H are off-shell by order $Q^2$.
Thus H can indeed be treated as if its lines are all far off-shell. In particular,
light-quark masses can legitimately be neglected compared to Q. A simple example is given
by a super-renormalizable theory. Graphs with cut and uncut propagator corrections,
Fig. 4, to the handbag diagram have the same power law in Q as the simple handbag
diagram. Such graphs generate the correct final-state hadrons for the current-quark
jet. After a sum over cuts, all such corrections cancel at the leading power of Q, and
the structure function is correctly given by the lowest order handbag, Fig. 3.

• Soft gluons can connect the different final-state jets, and can connect the final-state jets
to the target subgraph. After a sum over final-state cuts these contributions cancel.
This complication is only present in a theory with elementary vector fields, e.g., QCD.
A cancellation can be proved, and for the purposes of this paper, we may assume that
no complications result from the implementation of the cancellation of soft gluons. In
more general processes, like the Drell-Yan process, the issue of soft-gluon cancellation
is much more difficult.

• In a general gauge, there can be extra collinear gluon lines connecting T and H. Such
gluons only contribute to the leading power if they have scalar polarization.
However, if a suitable 'physical' gauge is used (e.g., axial gauge with a gauge fixing vector
proportional to g), such contributions are not present ^Tj. There are some subtleties
associated with the use of such a gauge. For example, the analysis of the leading
regions in Refs. [pl] , p6| relies critically on Landau's analysis of the singularities
associated with the denominators of Feynman propagators. But physical gauges introduce
extra unphysical singularities: the physical gauges are not as physical as one often
supposes. For the purposes of this paper it is sufficient to ignore this complication, or
to assume that the appropriate light-like gauge is being used.

• The same phenomenon (in a covariant gauge) leads to what I term 'super-leading'
contributions, when H and T are joined only by gluons that have scalar polarizations.
It can be shown [^] that the super-leading contributions cancel after a sum over a
'gauge-invariant set' of graphs for H, and that [p0| ,p9| the sum over attachments of
scalar gluons to the hard part gives the correct gauge-invariant form of the parton
densities, with a path-ordered exponential of the gluon field joining the two main
parton vertices.

^ But see the comments below concerning Fig. 4.



B. Relation of leading regions to mass singularities 

To characterize the regions of momenta that Fig. 1 depicts, it is convenient to use
light-front coordinates, where we write a 4-vector V as $V^\mu = (V^+, V^-, V_T)$ with
$V^\pm = (V^0 \pm V^z)/\sqrt{2}$. Then we choose a coordinate frame such that

$$
p^\mu = \left( p^+, \frac{p^2}{2p^+}, 0_T \right), \qquad
q^\mu \simeq \left( -x p^+, \frac{Q^2}{2 x p^+}, 0_T \right). \tag{1}
$$

The approximation in the definition of q represents the neglect of power suppressed terms,
given that x is normally defined as $Q^2 / 2 p \cdot q$.

FIG. 4. Handbag diagram with the final-state interactions that make the current quark jet.
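
The light-front components can be checked numerically. The sketch below assumes the frame in which the target is p = (p^+, p^2/(2p^+), 0_T) and the photon is q ≃ (-x p^+, Q^2/(2 x p^+), 0_T), and it uses the scalar product V·W = V^+ W^- + V^- W^+ - V_T·W_T that follows from V^± = (V^0 ± V^z)/√2. The numerical values and helper names are illustrative assumptions.

```python
import math


def to_light_front(v):
    """Convert (V0, Vx, Vy, Vz) to light-front components (V+, V-, (Vx, Vy))."""
    v0, vx, vy, vz = v
    return ((v0 + vz) / math.sqrt(2.0), (v0 - vz) / math.sqrt(2.0), (vx, vy))


def dot(a, b):
    """Minkowski product of two vectors given in light-front components."""
    (ap, am, at), (bp, bm, bt) = a, b
    return ap * bm + am * bp - at[0] * bt[0] - at[1] * bt[1]


def dis_frame(x, q_scale, p_plus, mass):
    """Target and photon momenta in the assumed frame (light-front components)."""
    p = (p_plus, mass**2 / (2.0 * p_plus), (0.0, 0.0))
    q = (-x * p_plus, q_scale**2 / (2.0 * x * p_plus), (0.0, 0.0))
    return p, q


# Illustrative kinematics (assumed values): x = 0.1, Q = 10 GeV, p+ of order Q.
p, q = dis_frame(x=0.1, q_scale=10.0, p_plus=10.0, mass=0.938)

# x reconstructed from invariants; it differs from the input x only by a
# power-suppressed term of relative size x^2 m^2 / Q^2.
x_from_invariants = -dot(q, q) / (2.0 * dot(p, q))
```

In this frame q·q equals -Q^2 exactly, while the reconstructed x agrees with the input only up to the target-mass correction, which is the power-suppressed term neglected in the definition of q.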

To exhibit the counting of powers of Q in its simplest form, we will choose to boost the
frame in the z direction until $p^+$ is of order Q. Then regions of momentum corresponding
to the hard and target subgraphs are defined by saying that, for a momentum $k^\mu$,

• $k$ is in H if $k^-$ is of order Q.

• $k$ is in T if $k^\mu = (O(Q), o(Q), o(Q))$, i.e., $k^+$ is of order Q, while $k^-$ and $k_T$ are much
smaller than Q, as is appropriate for a momentum collinear to the incoming hadron.^

After a sum over final-state cuts, the interactions that hadronize the jets in the hard subgraph
cancel pT|, p^] , and then we may treat the lines in H as if they are all off-shell by order $Q^2$.

The gauge we are using is the light-cone gauge $A^+ = 0$. In this gauge, regions with extra
gluons joining the target and hard subgraphs in Fig. 1 are power-suppressed.

Much of the literature treats factorization in terms of mass singularities. To see the
relation to our treatment, suppose that we were to take a limit of the structure function
in which all light quarks and all external lines are massless. The target momentum would
become light-like: $p^\mu \to (p^+, 0, 0_T)$, so that there would be collinear and infra-red
divergences. The infra-red divergences cancel after a sum over the different possible graphs and
final-state cuts at a given order of perturbation theory, leaving only the collinear divergences
associated with the target. These occur ||2^ at momentum configurations symbolized again
by Fig. 1, but where momenta in T are exactly proportional to the target momentum, i.e.,
they are of the form $k^\mu = (k^+, 0, 0_T)$. There is an exact correspondence between the leading
regions (for any m) and the location of the singularities for m = 0: the leading regions are
just neighborhoods of the positions of the singularities. Moreover the counting of powers of
Q corresponds to the degree of divergence of the singularities.



^ We use the mathematicians' big O and little o notation:
$A = O(Q)$ means that $A$ is of order $Q$ in the limit $Q \to \infty$.
$A = o(Q)$ means that $A/Q \to 0$ in the limit $Q \to \infty$.

However, in the true theory there need not be any actual divergences. For example, in a
non-QCD model we could endow all the particles with masses, and our proof of factorization
would remain correct. In QCD there are divergences that are associated with the necessary
masslessness of the gluon, but only if we make perturbative calculations with on-shell external
gluons or quarks. In the real world, these divergences are cut off by the non-perturbative
effects of confinement. All the real particles of QCD are massive. The singularities in the
massless limit merely provide a convenient tool for classifying regions of momentum space.



C. Elementary treatment of factorization 

The factorization theorem can easily be motivated from Fig. |^, as we will now show. We 
will construct an approximation to a proof of the theorem that will introduce a number of 
useful ideas. The proof will be exactly correct in a super-renormalizable theory, where the 
single important region is given in Fig. |. In that case the proof is equivalent to the argument 
given by Landshoff and Polkinghorne for the parton model. The greater detail given in
the present paper will enable us to make precise operator definitions of parton densities. In
addition, we will introduce some notations and auxiliary concepts that will be useful in the
full proof.

The hypothesis on which the approximate proof rests is an assumption that important
momenta can be classified as belonging to either a region of hard momenta (that belong
only in $H$) or a region of momenta collinear to the initial hadron $p$ (that belong only in $T$).
We will need to assume (not quite correctly) that the momenta collinear to the target have
virtualities that are fixed when $Q$ becomes large, and more specifically that the orders of
magnitude of the components of a target momentum are $(Q, m^2/Q, m)$, where $m$ is a typical
light hadron scale.

Given this hypothesis, each graph can be decomposed unambiguously into a sum of
terms of the form of Fig. 1. Thus we can write

$$ \begin{aligned} F &= \sum_{\text{graphs } \Gamma} \Gamma + \text{non-leading power} \\ &= \sum_{\text{graphs } \Gamma} \; \sum_{\text{regions } R} H(R) \cdot T(R) + \text{non-leading power}, \end{aligned} \tag{2} $$

where the summation over $\Gamma$ is restricted to those graphs that are two-light-particle reducible
in the $t$-channel and that therefore have at least one decomposition of the form of Fig. 1.
A region of such a graph is completely defined by its hard and target subgraphs, so we can
replace the sum over graphs and regions by independent sums over graphs for $H$ and $T$:

$$ F = H \cdot T + \text{non-leading power}. \tag{3} $$

Here $H$ and $T$ are the sum over all possibilities for the $H$ and $T$ subgraphs in Fig. 1, with the
momenta being restricted to the appropriate regions. The symbol $\cdot$ represents a convolution:
the integral over the 4-momentum linking $H$ and $T$ and a sum over the flavor, color and
spin indices of the lines joining the two subgraphs. Thus we have

$$ H \cdot T = \sum_i \int \frac{d^4k}{(2\pi)^4} \, H_i(q,k) \, T_i(k,p). \tag{4} $$

Recall that we defined $T$ to include the full propagators on the two lines that connect it to
$H$, so that $H$ is amputated in these same two lines.

^ Incidentally, this hypothesis excludes heavy quarks from consideration at this level of treatment,
an error which we will remedy later.

To get the factorization theorem, we use the observation that some of the components
of the loop momentum can be neglected in $H$, and also that some of the components of the
trace over spin labels can be neglected. In the $H$ factor in Eq. (4), we may neglect both
$k^-$ and $k_T$, since all the lines in $H$ are effectively off-shell by order $Q^2$. This results in an
error that is suppressed by one or two powers of $Q$. Thus we can approximate the structure
function by:

$$ F = \int_x \frac{d\xi}{\xi} \, H(q; \xi p^+, 0_T) \int \frac{\xi p^+ \, dk^- \, d^2k_T}{(2\pi)^4} \, T(k,p) + \text{non-leading power}. \tag{5} $$

Here, to make contact with the standard usage in this subject, we have written $k^+ = \xi p^+$
and have changed variable from $k^+$ to $\xi$.
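The claim that dropping $k^-$ and $k_T$ inside $H$ costs only a power of $Q$ can be illustrated with a single line of virtuality $(q + k)^2$ standing in for the hard subgraph. This is only a sketch: the choice of one propagator as a proxy for $H$, the sample values of $x$ and $\xi$, and the helper names are assumptions made for illustration, with the collinear momentum taken to have components of order $(Q, m^2/Q, m)$.

```python
def lc_square(v):
    """v^2 = 2 v+ v- - v_T^2 for v = (plus, minus, (t1, t2))."""
    plus, minus, (t1, t2) = v
    return 2 * plus * minus - (t1 * t1 + t2 * t2)

def add(a, b):
    return (a[0] + b[0], a[1] + b[1], (a[2][0] + b[2][0], a[2][1] + b[2][1]))

def relative_error(Q, m=1.0, x=0.1, xi=0.3):
    """Relative change in (q + k)^2 when k^- and k_T are set to zero."""
    p_plus = Q                                    # boosted frame: p+ of order Q
    q = (-x * p_plus, Q**2 / (2 * x * p_plus), (0.0, 0.0))
    k = (xi * p_plus, m**2 / Q, (m, 0.0))         # collinear: (Q, m^2/Q, m)
    k_hat = (xi * p_plus, 0.0, (0.0, 0.0))        # approximated momentum
    exact = lc_square(add(q, k))
    approx = lc_square(add(q, k_hat))
    return abs(exact - approx) / abs(exact)

for Q in (10.0, 100.0, 1000.0):
    print(Q, relative_error(Q))   # falls roughly like m^2 / Q^2
```

In this particular proxy the error falls like two powers of $Q$, which is one of the two cases allowed by the statement above.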

In Eq. (5) there is an implicit sum over the spin indices and the flavor of the lines joining
$T$ and $H$. Suppose the line is a quark. Then we can decompose each of $H$ and $T$ into a sum
of Dirac $\gamma$ matrices. The leading terms involve a $\gamma^-$ in the target subgraph $T$ since that can
be contracted with the largest momentum components in $T$, which are the $+$ components.
Thus the most general form of the part of $T$ that gives the leading power is a sum of terms
proportional to $\gamma^-$, $\gamma^- \gamma_5$ and $\gamma^- \gamma_T$.

For the simple case of unpolarized scattering, only the $\gamma^-$ term contributes, and we can
write

$$ F = \sum_a \int_x \frac{d\xi}{\xi} \, \mathrm{tr}\!\left[ H_a \frac{\gamma^-}{2} \right] \int \frac{dk^- \, d^2k_T}{(2\pi)^4} \, \xi p^+ \, \mathrm{tr}\!\left[ \frac{\gamma^+}{2} T_a \right] + \text{gluon terms} + \text{non-leading power}, \tag{6} $$

with a similar decomposition being applied to the gluon term. Here a labels the different 
flavors of quark and antiquark. (Note that in the usual applications, H and T are diagonal 
in quark flavor and only a single flavor index is required, the same for each of the lines 
joining H and T.) A similar result applies when H and T are joined by gluon lines. 

It is convenient to represent this formula in a convolution notation with the aid of a
projection operator $Z$:

$$ F = H \cdot Z \cdot T + \text{non-leading power}. \tag{7} $$

$Z$ represents the operation of setting $k_T = k^- = 0$ for the momentum of the external parton
of the hard scattering and of picking out the largest terms in the spin indices coupling the
hard and target subgraphs. It is a sum of quark and gluon terms. The quark term is:

$$ Z_{\alpha\alpha';\beta\beta'}(k, l; \text{1st defn.}) = \frac{1}{4} \gamma^-_{\alpha\alpha'} \gamma^+_{\beta\beta'} \, (2\pi)^4 \, \delta(k^+ - l^+) \, \delta(k^-) \, \delta^{(2)}(k_T). \tag{8} $$

^ Generalization of the results to the polarized case results in purely notational complications, as
regards the proof of factorization [?].

This and similar objects will be used repeatedly in our work. It is readily verified that $Z$ is
a projection, i.e.,

$$ Z^2 = Z, \tag{9} $$

and hence, for example, $(1 - Z) \cdot Z = 0$. The label '1st definition' in Eq. (8) indicates that
a modified definition, which we will now give, is superior.

In fact, the above definition of the projector $Z$ is suitable for massless quarks. Its use
in Eq. (7) remains valid when the quarks in $H$ have non-zero mass, but it is not perfectly
convenient for practical calculations. For example, calculations of the short-distance coefficient
functions do not satisfy exact gauge invariance, because the external lines of $H$ are
off-shell. Therefore it is convenient to replace Eq. (8) by a definition in which the external
quarks of $H$ are put on-shell. This involves replacing $k$ by an on-shell momentum

$$ \hat{k}^\mu = \left( \xi p^+, \; \frac{m^2}{2 \xi p^+}, \; 0_T \right), \tag{10} $$

and using the Dirac matrix for on-shell wave functions:

$$ Z_{\alpha\alpha';\beta\beta'}(k, l; \text{massive quark}) = \frac{(\hat{k}\!\!\!/ + m)_{\alpha\alpha'}}{4 k^+} \, \gamma^+_{\beta\beta'} \, (2\pi)^4 \, \delta(k^+ - l^+) \, \delta\!\left( k^- - \frac{m^2}{2 k^+} \right) \delta^{(2)}(k_T). \tag{11} $$

The resulting leading-power approximation to $F$ is

$$ H \cdot Z \cdot T = \sum_a \int \frac{d\xi}{\xi} \, \mathrm{tr}\!\left[ H_a \frac{\hat{k}\!\!\!/ + m}{2} \right] \int \frac{dk^- \, d^2k_T}{(2\pi)^4} \, \mathrm{tr}\!\left[ \frac{\gamma^+}{2} T_a \right] + \text{gluon terms}. \tag{12} $$

Here $\hat{k}$ is the approximated momentum, Eq. (10). Notice that although the external parton
lines of $H$ are put on-shell, this is not true of the corresponding external partons of the
target subgraph $T$; these are integrated over all values of $k^-$ and $k_T$ in the collinear region
of momentum.

The change in the definition of $Z$ for massive quarks does not affect the factorization
theorem Eq. (7). To see this, observe that the change of definition only changes small
components of the momentum $k$ and of the $\gamma$ matrices attached to $H$. Thus we have only
made an error similar in size to the power-suppressed error that we already induced by
making an approximation in the first place. Also the algebraic property $Z^2 = Z$, which we
will make frequent use of later, is unchanged.
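The projection property for the massive-quark definition can be verified in two short steps. The following is a sketch under stated assumptions: the projector is taken to carry the normalization $1/(4k^+)$, the trace convention is $\mathrm{tr}\, \gamma^\mu \gamma^\nu = 4 g^{\mu\nu}$, and the intermediate momentum is called $j$ (a label introduced here for illustration). The Dirac indices that are summed in $Z \cdot Z$ give a trace equal to unity, and the intermediate delta functions are removed by the $j$ integral:

```latex
\begin{align*}
\frac{\mathrm{tr}\left[ \gamma^+ \left( \hat{j}\!\!\!/ + m \right) \right]}{4 j^+}
  &= \frac{4 \hat{j}^+ + 0}{4 j^+} = 1 , \\
\int \frac{d^4 j}{(2\pi)^4} \,
  (2\pi)^4 \delta(k^+ - j^+) \,
  (2\pi)^4 \delta(j^+ - l^+) \,
  \delta\!\left( j^- - \frac{m^2}{2 j^+} \right) \delta^{(2)}(j_T)
  &= (2\pi)^4 \delta(k^+ - l^+) .
\end{align*}
```

The remaining factors $\delta(k^- - m^2/2k^+)\, \delta^{(2)}(k_T)$ and the matrices $(\hat{k}\!\!\!/ + m)_{\alpha\alpha'}\, \gamma^+_{\beta\beta'} / 4k^+$ are untouched, so $Z \cdot Z$ reproduces $Z$. Setting $m = 0$ gives the same check for the massless definition.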

Since the operation $Z$ projects out the integral over $k^+$, the above leading-power approximation
gives the structure function as a convolution of a hard scattering coefficient and parton densities:

$$ F = \hat{F} \otimes f + \text{non-leading power}. \tag{13} $$

The symbol $\otimes$ represents a convolution in the $\xi$ variable, together with a sum over quark
flavors and over the gluon. It will also include a sum over the spin degrees of freedom if
polarization-dependent effects are being treated.

^ Observe that in conventional treatments of factorization, it is normal to set quark masses to
zero in the hard scattering. Precisely because we wish to treat heavy quarks, we do not at this
point choose to set quark masses to zero.

The parton densities can be expressed in their usual form [?] as matrix elements of
light-cone operators. A quark density is then

$$ f_a(\xi) = \int \frac{dk^- \, d^2k_T}{(2\pi)^4} \, \mathrm{tr}\!\left[ \frac{\gamma^+}{2} T_a(k,p) \right], \qquad k^+ = \xi p^+. \tag{14} $$

Given that we obtained the factorization theorem by decomposing momentum space into
a hard region and a collinear region, the integral in Eq. (14) is restricted to the collinear
region. When we provide a more correct proof, we will remove the restriction to collinear
momenta, so that the definition of a parton density is exactly as a matrix element of a bilocal
operator on the light-cone.

From the definition of $Z$, Eq. (11), it then follows that the hard scattering coefficient
is computed from $H$ by contracting with the Dirac matrices appropriate for an external
on-shell fermion, with a spin average:

$$ \hat{F} = \mathrm{tr}\, H \, \frac{\hat{k}\!\!\!/ + m}{2}. \tag{15} $$

The factor of 1/2 means that $\hat{F}$ has the normalization of a spin-averaged cross section.



D. Why the simple derivation does not work 

The above derivation of the factorization theorem would be valid if one could use a fixed 
decomposition of momentum space into regions appropriate for H and T, at least up to 
power-suppressed terms. This assumption is in fact true in a super-renormalizable theory, 
and the above derivation then leads to the parton model. Only the lowest order graph for 
H gives a leading contribution in this case. Fig. ^. This kind of reasoning led Feynman to 
formulate the parton model [^ . 



Unfortunately the error estimates obtained from the above argument, in a renormalizable
theory, are of a relative size that we represent as of order $(T/H)^p$. Here we use $T$ to represent
the largest virtuality in the subgraph $T$, we use $H$ to represent the smallest important
virtuality in $H$, and $p$ is a fixed exponent. In a super-renormalizable theory there are
leading power contributions only when the virtualities in the subgraph $T$ are of order a
hadronic mass (squared), so we get an excellent error estimate. But in renormalizable
theories, including QCD, there are logarithmic corrections that cover the whole range of
virtualities from a hadronic mass up to $Q$. Thus the only simple estimate of the errors
is that they are of relative order unity, with perhaps only a logarithmic suppression: the
maximum virtuality in $T$ might only be a little less than the minimum virtuality in $H$. A
more powerful argument is needed to get a good proof of a theorem of the form of Eq. (13),
with relative errors of order $(\Lambda/Q)^p$, where $\Lambda$ denotes a typical hadronic infra-red scale.

^ $\hat{F} \otimes f = \int \frac{d\xi}{\xi} \, \hat{F}(x/\xi) \, f(\xi)$.

^ This fact is established from the same power-counting rules that show that all regions of the
form of Fig. 1 are leading in a renormalizable theory.

In addition, when we have heavy quarks, the proof does not give us a factorization 
theorem that applies uniformly for any value of Q larger than or of order of the quark mass. 
If Q is much larger than M, the proof gives a factorization of just the same form as with 
light quarks. If Q were of order M, then we would have to restrict the lines joining H and 
T to be light partons, and then to use the methods of Sec. ^ below. But the proof would 
be unable to give an optimal error estimate in the intermediate region. 

V. PROOF OF FACTORIZATION WHEN $Q \gtrsim M$

Even with its defects, the reasoning in the previous section contains a core of truth, 
which we will now use as the basis for a correct proof. 
Our aim is to prove

$$ F = \hat{F} \otimes f + \text{remainder}, \tag{16} $$

with the following properties:

• The coefficient function $\hat{F}(x/\xi, Q^2, M^2)$ is infra-red safe: it is dominated by virtualities
of order $Q^2$.

• The parton density $f$ is a renormalized matrix element of a light-cone operator.

• The remainder is suppressed by a power of $\Lambda/Q$.

• This suppression is uniform over the whole range $Q \gtrsim M$, so that, for example, there
are no $O(M/Q)$ terms.

This theorem looks just like the result (13) we tried to prove by elementary methods, except
that the precise definitions of the factors are different.



A. Expansion in 2PI graphs 

To utilize the result in Fig. 1, it is convenient [?] to decompose the structure function
in terms of two-particle irreducible amplitudes, Fig. 5:

$$ F = \sum_{n=0}^{\infty} C_0 \cdot (K_0)^n \cdot T_0 + D = C_0 \cdot \frac{1}{1 - K_0} \cdot T_0 + D. \tag{17} $$

FIG. 5. Decomposition of structure function in terms of 2PI amplitudes. 



The notations $C_0$ and $K_0$ are the same as in Ref. [?]. Each of the amplitudes is two-particle
irreducible (2PI) in the horizontal channel (i.e., the $t$-channel), except for the inclusion of
full propagators joining the amplitudes. Thus $D$ is the 2PI part of the structure function,
while for the reducible graphs, $C_0$ is the 2PI subgraph to which the currents couple, and
$T_0$ is the 2PI subgraph to which the target hadron couples. Both $K_0$ and $T_0$ include full
propagators on the left side, and consequently $C_0$ and $K_0$ are amputated on the right,
just as in Fig. 1. In principle this is a non-perturbative decomposition. The intermediate
two-particle 'states in the t channel', between the $C_0$, $K_0$, and $T_0$ factors, include all flavors
of parton, including heavy quarks.



B. Construction of remainder 

It turns out to be convenient to first construct what will turn out to be the remainder
in Eq. (16). This is defined by the following formula

$$ \begin{aligned} r &= \sum_{n=0}^{\infty} C_0 \cdot (1 - Z) \cdot [K_0 (1 - Z)]^n \cdot T_0 + D \\ &= C_0 \cdot \frac{1}{1 - (1 - Z) K_0} \cdot (1 - Z) \cdot T_0 + D \\ &= C_0 \cdot (1 - Z) \cdot \frac{1}{1 - K_0 (1 - Z)} \cdot T_0 + D, \end{aligned} \tag{18} $$

with $Z$ being defined by Eq. (11). This formula is obtained from the formula Eq. (17) for
the structure function by inserting a factor $1 - Z$ on each two-particle intermediate state
in the $t$-channel. This, as we will show, gives a power suppression. The 2PI part, $D$, is
non-leading since all the leading regions, Fig. 1, are associated with two-particle reducible
graphs. The $1 - Z$ factors may be considered as providing subtractions that cancel all the
leading regions. That is, if we start with the decomposition Eq. (17) of the full structure
function and subtract off all leading contributions, then we end up with Eq. (18).

^ The subscript 0 in $C_0$, $K_0$ and $T_0$ is used because we will want to define some related but
different objects later, with the same primary symbol, and we will in particular wish to reserve the
unadorned symbol $C$ for the short-distance coefficient.

^ Strictly speaking, this means that to call the amplitudes 2PI is not quite correct.

^ In the case that the external hadrons are replaced by quarks or gluons, we will have $D = 0$ and
$T_0 = 1$.

Once we know that $r$ as defined above is power-suppressed, we will be able to use the
methods of linear algebra to construct a factorized form for $F - r$. This will be sufficient to
give the factorization theorem together with all the desired properties.

Now, leading contributions to the structure function come from regions of the form of
Fig. 1. At the boundary between the hard and target subgraphs, inserting a factor of the
operator $Z$ gives a good approximation. Hence an insertion of a factor $1 - Z$ produces a
power suppression. Inserting a factor $1 - Z$ at other places does not increase the order of
magnitude of the graph. Since we have put a factor $1 - Z$ at every possible position
of boundary between hard and target subgraphs, we obtain a power suppression for every
term in Eq. (18).

To be more concrete, suppose that we have a region of the form of Fig. 1. The insertion
of a factor $1 - Z$ at the boundary between the region's hard subgraph and its target subgraph
gives a suppression by a factor of order

$$ \left( \frac{\text{highest virtuality in } T}{\text{lowest virtuality in } H} \right)^p, \tag{19} $$

as follows from the arguments in Sec. IV C.



Furthermore, let us observe that in the left-most rung, closest to the virtual photon, we
have virtualities of order $Q^2$, while in the right-most rung, closest to the target, we have
virtualities of order $\Lambda^2$. Within a given rung, the leading power contribution comes where
all the lines have comparable virtualities, since leading power contributions only occur when
the boundaries of very different virtualities are as in Fig. 1. Given that in Eq. (18) we have
a factor $1 - Z$ between every 2PI rung, there is a suppression whenever there is a strong
decrease of virtuality in going from one rung to its neighbor to the right. Thus we find that
Eq. (18) has an overall suppression of order

$$ \left( \frac{\Lambda}{Q} \right)^p \tag{20} $$

when it is compared to the structure function itself, Eq. (17).

This suppression of course gets degraded as one goes to higher order for the rungs, since
the lines within $K_0$ can have somewhat different virtualities. The larger a graph we have for
$K_0$, the wider the range of virtualities we can have without meeting a significant suppression.




^ Except that certain ultra-violet divergences may be introduced. We will see later that there are
divergences when one separates the terms in Eq. (18) with the 1 and the $Z$ factors, but that there
are no divergences in Eq. (18) itself.

FIG. 6. Second term of third line of Eq. (21).




FIG. 7. Induced UV divergences in r are in subgraphs of the form of U in this diagram. 



C. Induced UV divergences 



The above argument shows that the quantity $r$, as defined by Eq. (18), is power-suppressed
in all the regions of momentum space that are relevant for the structure function
$F$. However, the existence of terms containing factors of $Z$ in Eq. (18) entails some extra
regions. These regions have the potential of not only being unsuppressed but also of giving
UV divergences.

The lowest order non-trivial example is given by the $n = 1$ term:

$$ \begin{aligned} r_1 &= C_0 \cdot (1 - Z) \cdot K_0 \cdot (1 - Z) \cdot T_0 \\ &= C_0 \cdot K_0 \cdot (1 - Z) \cdot T_0 - C_0 \cdot Z \cdot K_0 \cdot (1 - Z) \cdot T_0 \\ &= C_0 \cdot K_0 \cdot (1 - Z) \cdot T_0 - C_0 \cdot Z \cdot K_0 \cdot T_0 + C_0 \cdot Z \cdot K_0 \cdot Z \cdot T_0. \end{aligned} \tag{21} $$

In the second term on the last line, the factor $Z \cdot K_0 \cdot T_0$ is a contribution to the matrix
element of the bilocal operator defining a parton density, Fig. 6. There is a UV divergence
when $k_T$ and $k^-$ in the loop(s) comprising the operator vertex and the rung $K_0$ go to
infinity. The divergence is in fact cancelled by the last term in Eq. (21). To see this, observe
that the two terms combine to give the second term on the second line. The $1 - Z$ factor
gives a power suppression of the potentially divergent region, and the proof is the same as
we used to obtain the suppression proved in the previous subsection. Look ahead to Sec. VI
to see a concrete example illustrating the above manipulations.

A general proof of the cancellation of the induced UV divergences immediately suggests
itself. The regions that give the possible divergences arise from regions of the form shown
in Fig. 7. There, the insertion of a $Z$ factor between two rungs has given an operator
vertex, through which can flow ultra-violet momenta. The proof of cancellation of the UV
divergences is simply that the $1 - Z$ factors to the right suppress the regions giving the UV
divergences.



D. Factorization 



We now derive a factorization formula for the structure function by showing that $r$ is
equal to the structure function minus the factorized term in Eq. (16). Starting from Eqs.
(17) and (18), we find

$$ \begin{aligned} F - r &= C_0 \cdot \left[ \frac{1}{1 - K_0} - \frac{1}{1 - (1 - Z) K_0} (1 - Z) \right] \cdot T_0 \\ &= C_0 \cdot \frac{1}{1 - (1 - Z) K_0} \cdot \left[ 1 - (1 - Z) K_0 - (1 - Z)(1 - K_0) \right] \cdot \frac{1}{1 - K_0} \cdot T_0 \\ &= C_0 \cdot \frac{1}{1 - (1 - Z) K_0} \cdot Z \cdot \frac{1}{1 - K_0} \cdot T_0. \end{aligned} \tag{22} $$



This proof is very similar to some proofs in Ref. [18] or [?]. It consists of some ordinary linear
algebra, which is valid since $Z$ and $K_0$ are just linear operators on the space of 4-momenta.
The form of the right-hand side of this equation is that of the factorization theorem. Aside
from a normalization, the factor $Z \cdot [1/(1 - K_0)] \cdot T_0$ is exactly the matrix element that is a
parton density, and then the remaining factor is the short-distance coefficient function.
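Because the derivation is pure linear algebra, it can be checked exactly in a finite-dimensional analogue. The sketch below is an illustration under stated assumptions: momenta are replaced by a three-point index, the convolution becomes a matrix product, $Z$ is the matrix whose first row is all ones (the discrete analogue of setting the momentum entering the left-hand factor to zero), $D$ is set to zero, and the entries of `K`, `C` and `T` are arbitrary rational numbers chosen only so that the inverses exist.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[Fr(int(i == j)) for j in range(n)] for i in range(n)]

def inverse(A):
    """Gauss-Jordan inverse over the rationals (assumes A is invertible)."""
    n = len(A)
    M = [list(row) + unit for row, unit in zip(A, identity(n))]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def chain(*factors):
    out = factors[0]
    for f in factors[1:]:
        out = matmul(out, f)
    return out

n = 3
one = identity(n)
Z = [[Fr(1)] * n] + [[Fr(0)] * n for _ in range(n - 1)]   # toy projector
K = [[Fr(1, 10 + i + 2 * j) for j in range(n)] for i in range(n)]
C = [[Fr(1, 1 + k) for k in range(n)]]                    # row vector
T = [[Fr(1, (1 + k) ** 2)] for k in range(n)]             # column vector
one_minus_Z = sub(one, Z)

F = chain(C, inverse(sub(one, K)), T)
r = chain(C, one_minus_Z, inverse(sub(one, matmul(K, one_minus_Z))), T)
factorized = chain(C, inverse(sub(one, matmul(one_minus_Z, K))), Z,
                   inverse(sub(one, K)), T)
print(matmul(Z, Z) == Z, F[0][0] - r[0][0] == factorized[0][0])
```

Exact rational arithmetic is used so that the comparison is an identity rather than an agreement within rounding error.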
The only complication is the presence of UV divergences of the form discussed in Sec.
V C. There are divergences in the parton density factor $Z \cdot [1/(1 - K_0)] \cdot T_0$ on the right-hand
side of Eq. (22). There are also divergences in the coefficient function $C_0 \cdot [1/(1 - (1 - Z) K_0)]$.
Of course, these divergences cancel, since the left-hand side of Eq. (22) is finite, as we
have already proved. For the moment, let us just apply any convenient UV regulator, e.g.,
dimensional regularization. We will show later how to reorganize the right-hand side of
Eq. (22) in terms of UV finite quantities.

Given that there is a regulator, so that everything in Eq. (22) is well defined, we define
a bare coefficient function

$$ C_b = C_0 \cdot \frac{1}{1 - (1 - Z) K_0}, \tag{23} $$

and a bare operator matrix element

$$ A_b = Z \cdot \frac{1}{1 - K_0} \cdot T_0. \tag{24} $$



This differs slightly in normalization from the parton densities defined in Eq. (14), since $Z$
contains a $\frac{1}{2}(\hat{k}\!\!\!/ + m)$ factor that we will ultimately put in the coefficient function. Other
than that, the matrix element in Eq. (24) is the same as the parton density defined in Eq.
(14) when the momenta are unrestricted, which was not the case in our earlier derivation.

^ Our use of the terminology 'bare parton density' has nothing in common with the usage in
some other literature [8,18,19]. In the present work, and in Ref. [?], the word 'bare' is used to
denote a quantity that has ultra-violet divergences that have not been cancelled by renormalization.
In [8,18,19], the word 'bare' refers in some undefined sense to parton densities that are convoluted
with unsubtracted partonic cross sections, and divergences in such a quantity are infra-red, not
ultra-violet. See Sec. XIII, where we examine Zimmermann's methods, for a way of giving
meaning to such formulae.

From Eq. (22), together with the property that $r$ is power suppressed, follows the factorization
theorem

$$ F = C_b \otimes A_b + \text{non-leading power}. \tag{25} $$

Except for the subscripts, this equation has the same form as Eq. (13). As in that equation,
we have replaced the symbol $\cdot$ for convolution in 4-momentum by the symbol $\otimes$ for
convolution in fractional $+$ momentum. The difference between the two factorizations is
that in Eq. (25) the integrals defining the parton density and the coefficient are unrestricted.
Instead, the coefficient function, Eq. (23), has factors of $1 - Z$ placed between the 2PI rungs.
As we will see in an example in Sec. VI, these factors have the effect of making subtractions
that prevent the double counting of the different regions and of forcing the momenta in the
integrals for the coefficient function to be in the hard region of virtuality of order $Q$. In
contrast to this, the integrals in our first approximation to a factorization theorem, Eq. (13),
are restricted to particular regions. Moreover, for the new form of the factorization equation
we have an explicit estimate of the error, Eq. (20).

The bare matrix element $A_b$ is exactly a matrix element of a particular bilocal light-cone
operator. This follows from the fact that it is defined as an integral of the form of Eq. (14),
with unrestricted integrals over $k^-$ and $k_T$.



VI. EXAMPLE 

To understand the meaning of the above derivation, it is convenient to examine a simple 
set of integrals that have the same structure. 

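First, we observe that all the equations can be written as a sum over powers in $K_0$, and
that the equations are true for each power of $K_0$ separately. Thus we can write the first few
terms in the structure function $C_0 \cdot \frac{1}{1 - K_0} \cdot T_0$ as

$$ C_0 \cdot T_0 = [C_0 \cdot Z] \cdot [Z \cdot T_0] + C_0 \cdot (1 - Z) \cdot T_0, \tag{26} $$

$$ C_0 \cdot K_0 \cdot T_0 = [C_0 \cdot Z] \cdot [Z \cdot K_0 \cdot T_0] + [C_0 \cdot (1 - Z) \cdot K_0 \cdot Z] \cdot [Z \cdot T_0] + C_0 \cdot (1 - Z) \cdot K_0 \cdot (1 - Z) \cdot T_0, \tag{27} $$

$$ \begin{aligned} C_0 \cdot K_0^2 \cdot T_0 = {} & [C_0 \cdot Z] \cdot [Z \cdot K_0^2 \cdot T_0] + [C_0 \cdot (1 - Z) \cdot K_0 \cdot Z] \cdot [Z \cdot K_0 \cdot T_0] \\ & + [C_0 \cdot (1 - Z) \cdot K_0 \cdot (1 - Z) \cdot K_0 \cdot Z] \cdot [Z \cdot T_0] \\ & + C_0 \cdot (1 - Z) \cdot K_0 \cdot (1 - Z) \cdot K_0 \cdot (1 - Z) \cdot T_0. \end{aligned} \tag{28} $$

The last term in each line is a power-suppressed and finite remainder term, the contribution
at the appropriate order in $K_0$ to the remainder $r$ defined in Eq. (18). The other terms are
each a contribution to the coefficient function in Eq. (23) times a contribution to the matrix
element in Eq. (24). (I have used $Z^2 = Z$ and then the square-bracket notation to make
this structure more manifest.)

^ Note that $K_0$ can be expanded in powers of the strong coupling $\alpha_s$, so that this expansion is
related to the ordinary perturbation expansion.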
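The pattern of terms at a given power of $K_0$ is mechanical: a coefficient piece with $n$ rungs, each preceded by $1 - Z$, multiplies a matrix-element piece with the remaining rungs, and one remainder term has $1 - Z$ everywhere. A small sketch that writes the terms out as strings follows; the function name `expansion_terms` and the plain-text notation `C0`, `K0`, `T0` are choices made here for illustration.

```python
def expansion_terms(order):
    """Terms of C0.K0^order.T0: factorized pieces first, remainder last."""
    terms = []
    for n in range(order + 1):      # n rungs sit in the coefficient function
        m = order - n               # m rungs sit in the matrix element
        coefficient = "C0" + ".(1-Z).K0" * n + ".Z"
        matrix_element = "Z" + ".K0" * m + ".T0"
        terms.append(f"[{coefficient}][{matrix_element}]")
    terms.append("C0" + ".(1-Z).K0" * order + ".(1-Z).T0")
    return terms

for order in range(3):
    print(order, " + ".join(expansion_terms(order)))
```

At order $N$ there are $N + 1$ factorized terms and one remainder term.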



A. Model 



Now let us make a simple mathematical model that has all the relevant structure. We
replace integrals over 4-dimensional momenta by integrals over a 1-dimensional variable that
runs between 0 and $\infty$, and we remove all labels for the flavor and spin of the partons. We
also set the fully 2PI part $D$ of the structure function to zero. Then we define

$$ C_0(k) = \frac{Q}{Q + k + m}, \qquad K_0(k, l) = \frac{\alpha_s}{k + l + m}, \qquad T_0(k) = \frac{1}{(k + m)^2}. \tag{29} $$

The motivations for these formulae are as follows:

$Q$ corresponds to the external photon momentum of deep-inelastic scattering, $m$ corresponds
to a quark mass (heavy or light), and $k$ and $l$ correspond to the loop momenta
coupling neighboring rungs in Eq. (17).

$C_0(k)$ is an analog of a lowest order graph for the hard part in Fig. 1. In deep-inelastic
scattering, it has a propagator that depends on a loop momentum $k$ plus a hard
momentum $q$. This is modeled by the denominator $Q + k + m$. The factor $Q$ in the
numerator is inserted to provide a convenient normalization: $C_0 \to 1$ as $Q \to \infty$.

$K_0(k, l)$ is an analog of the lowest order graph for a rung. The lowest order graph for $K_0$ in
Eq. (17) has a dependence on a difference of external momenta, $k$ and $l$. To make
a simpler mathematical example, we have replaced $k - l$ by $k + l$. To symbolize the
analogy with a rung, we have put in a factor of the strong coupling $\alpha_s$, just as we
would have for the lowest order rung in QCD. To ensure that the analogy is with a
renormalizable theory, $K_0$ is defined in such a way that the coupling is dimensionless.

$T_0(k)$ is given an extra power of $1/(k + m)$ compared with $K_0$. Then it gives a finite result
when integrated over all $k$, just as happens for $T_0$ in real QCD. We could have used
$T_0 = 1/(k + p + m)^2$, with $p$ being like an external momentum. But this would have
been an irrelevant complication.

In each denominator in Eq. (29), $m$ is meant to be like a mass term. Just as in QCD we get
a logarithmic infra-red divergence when we have an integral over $K_0(k, l)$ with respect to $k$,
and we replace $l$ and $m$ by zero.

The mathematical structures we get are of the same form as in QCD, but we will be
able to present simple formulae. For example, there is no longitudinal $+$ component of
momentum to integrate over in the factorization formula.

To obtain examples of heavy quark physics, we can replace $m$ in $C_0$ and/or some of the
$K_0$'s by $M$.

B. Lowest order



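The lowest-order term in the structure function $F$ is

$$ C_0 \cdot T_0 = \int_0^\infty dk \, \frac{Q}{Q + k + m} \, \frac{1}{(k + m)^2}. \tag{30} $$

When $Q \to \infty$, $k$ remains finite, and the asymptote is

$$ \int_0^\infty dk \, \frac{1}{(k + m)^2}. \tag{31} $$

Up to power suppressed factors, this is just the lowest order coefficient function $C_0 \cdot Z$ times
the lowest order matrix element $Z \cdot T_0$:

$$ C_0 \cdot Z \cdot T_0 = \frac{Q}{Q + m} \int_0^\infty dk \, \frac{1}{(k + m)^2}. \tag{32} $$

Here the operator $Z(k, l)$ is just $\delta(k)$. That is, we get $C_0 \cdot Z \cdot T_0$ from $C_0 \cdot T_0$ by setting $k = 0$
in the $C_0$ factor.

If we take $Q \to \infty$ with $m$ fixed, the leading power behavior is obtained by setting $m = 0$
in the coefficient function: $Q/(Q + m) \to 1$.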
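The lowest-order statements can be checked directly, since the toy integrals are elementary. The sketch below assumes the model functions $C_0(k) = Q/(Q + k + m)$ and $T_0(k) = 1/(k + m)^2$; the closed form $1/m - \ln(1 + Q/m)/Q$ for the full integral is our own evaluation by partial fractions, not a formula quoted from the text, and it is cross-checked against a simple midpoint quadrature (the helper names are illustrative).

```python
from math import log

def integrate_0_inf(f, scale, n=20000):
    """Midpoint rule for the integral of f over [0, inf), using k = scale*t/(1-t)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        k = scale * t / (1.0 - t)
        total += f(k) * scale / (1.0 - t) ** 2   # Jacobian dk/dt
    return total * h

def lowest_order_exact(Q, m):
    """Closed form of the integral of Q / ((Q+k+m) (k+m)^2) over k in [0, inf)."""
    return 1.0 / m - log(1.0 + Q / m) / Q

def lowest_order_factorized(Q, m):
    """C0.Z.T0: C0 evaluated at k = 0 times the integral of T0."""
    return Q / (Q + m) / m

def relative_remainder(Q, m):
    exact = lowest_order_exact(Q, m)
    return (exact - lowest_order_factorized(Q, m)) / exact

m = 1.0
numeric = integrate_0_inf(lambda k: 100.0 / ((100.0 + k + m) * (k + m) ** 2), scale=m)
print(numeric, lowest_order_exact(100.0, m))
for Q in (1e2, 1e3, 1e4):
    print(Q, relative_remainder(Q, m))   # shrinks roughly like (m/Q) ln(Q/m)
```

The relative remainder falls by roughly a factor of six per decade in $Q$, as expected for a single power of $m/Q$ times a logarithm.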



C. NLO term 



The next order term is

$$ C_0 \cdot K_0 \cdot T_0 = \int_0^\infty dk \int_0^\infty dl \, \frac{Q}{Q + k + m} \, \frac{\alpha_s}{k + l + m} \, \frac{1}{(l + m)^2}. \tag{33} $$

There are two simple regions that give a leading power $Q^0$: (a) $k$ and $l$ of order $m$, and (b)
$k$ of order $Q$ with $l$ of order $m$. In addition the region $Q \gg k \gg l \sim m$ interpolates between
the two simple regions and gives a logarithmically enhanced contribution of order $\ln Q$. This
last region gives the leading logarithm approximation. It can be checked that the leading
power contributions are all from the region where $l \sim m$.

To derive the factorization formula expanded to order $K_0$, we decompose $C_0 \cdot K_0 \cdot T_0$ as
follows:

$$ \begin{aligned} C_0 \cdot K_0 \cdot T_0 = {} & C_0 \cdot Z \cdot K_0 \cdot T_0 \\ & + C_0 \cdot (1 - Z) \cdot K_0 \cdot Z \cdot T_0 \\ & + C_0 \cdot (1 - Z) \cdot K_0 \cdot (1 - Z) \cdot T_0, \end{aligned} \tag{34} $$

just as in Eq. (27). We can explain the right-hand side of this equation as being obtained
by a series of successively improved approximations for the leading behavior as $Q \to \infty$.

The first term on the right is the lowest-order coefficient times the one-loop matrix
element:

$$ C_0 \cdot Z \cdot K_0 \cdot T_0 = \frac{Q}{Q + m} \int_0^\infty dk \int_0^\infty dl \, \frac{\alpha_s}{k + l + m} \, \frac{1}{(l + m)^2}. \tag{35} $$

It gives a good approximation to the original integral Eq. (33) in the region where $k$ and $l$
are of order $m$. Its accuracy gets worse as $k$ increases. Furthermore, we have an ultra-violet
divergence when $k \to \infty$, since the extra convergence at large $k$ given by the $Q/(Q + k + m)$
factor in (33) is removed by the approximation. In the real factorization theorem in field
theory, the divergence is the normal UV divergence associated with the insertion of the
vertex for a composite operator (such as $\bar\psi \gamma^+ \psi$). To define the integral in Eq. (35) we must
implicitly apply an ultra-violet regulator. The regulator can be removed if we apply suitable
renormalization, as we will show in Sec. VI E.



The poor approximation as $k$ increases towards $Q$ is remedied by the second term in Eq.
(34), the one-loop coefficient times the lowest-order matrix element:

$$ C_0 \cdot (1 - Z) \cdot K_0 \cdot Z \cdot T_0 = \int_0^\infty dk \left( \frac{Q}{Q + k + m} - \frac{Q}{Q + m} \right) \frac{\alpha_s}{k + m} \int_0^\infty dl \, \frac{1}{(l + m)^2}. \tag{36} $$

This can be thought of as a term $C_0 \cdot K_0 \cdot Z \cdot T_0$, which gives a good approximation when
$k \sim Q$, together with a subtraction term $-C_0 \cdot Z \cdot K_0 \cdot Z \cdot T_0$, which prevents double-counting
from the previous term, (35). The subtraction term suppresses the contribution to (36) of
the infra-red region $k \ll Q$, so that the one-loop contribution to the bare coefficient function

$$ \int_0^\infty dk \left( \frac{Q}{Q + k + m} - \frac{Q}{Q + m} \right) \frac{\alpha_s}{k + m} \tag{37} $$

has no IR divergence in the massless limit. This term also has a UV divergence equal and
opposite to that in (35), so that the sum of the two terms is UV finite.
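The cancellation of the UV divergences between the two leading terms can be made explicit by cutting off the $k$ integrals at a large value. The sketch below assumes the toy-model integrands given above; the closed forms inside `term_35` and `term_36`, and the limiting value in `leading_power`, are our own elementary evaluations of those integrals, and the hard cutoff is a regulator chosen for illustration only.

```python
from math import log1p

def term_35(Q, m, alpha_s, cutoff):
    """C0.Z.K0.T0 with the k integral restricted to k < cutoff."""
    x = cutoff / m
    # integral over k in [0, cutoff] and l in [0, inf) of 1/((k+l+m)(l+m)^2)
    k_and_l_integral = (log1p(x) / x + log1p(x) - 1.0) / m
    return Q / (Q + m) * alpha_s * k_and_l_integral

def term_36(Q, m, alpha_s, cutoff):
    """C0.(1-Z).K0.Z.T0 with the k integral restricted to k < cutoff."""
    a, b = Q + m, m
    # integral over k in [0, cutoff] of k / ((k+a)(k+b)), using a - b = Q
    k_integral = (a / Q) * log1p(cutoff / a) - (b / Q) * log1p(cutoff / b)
    return -Q / (Q + m) * alpha_s * k_integral / m   # 1/m is the l integral of T0

def leading_power(Q, m, alpha_s):
    """Cutoff -> infinity limit of the sum of the two terms."""
    return alpha_s / m * (log1p(Q / m) - Q / (Q + m))

Q, m, alpha_s = 100.0, 1.0, 0.2
for cutoff in (1e6, 1e9):
    a35, a36 = term_35(Q, m, alpha_s, cutoff), term_36(Q, m, alpha_s, cutoff)
    print(cutoff, a35, a36, a35 + a36, leading_power(Q, m, alpha_s))
```

Each term separately grows like the logarithm of the cutoff, while the sum settles to a cutoff-independent value that carries the expected $\ln(Q/m)$ enhancement.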



The structure of the subtraction terms is exactly the same as in the work of Aivazis et
al. on calculations of coefficient functions for heavy quark processes. To get a more exact
analogy to that work, one could change $C_0$ to $Q/(Q + k + M)$, i.e., one could replace the
light quark mass in $C_0$ by a heavy quark mass. This mimics the effect of a heavy quark loop
at the left-hand end of the diagram (confined to $C_0$). It is left to the reader to check that
all the statements we make about the asymptotic behavior remain true in this heavy quark
example, provided only that $Q$ is large compared to the light quark mass $m$, and that $Q$ is
roughly at least as large as the heavy quark mass $M$. That is, the remainder is suppressed
by $m/Q$ rather than just $M/Q$.



D. NLO: remainder 



The third term on the right of Eq. ( p4D is the remainder. It is simply the left-hand-side 
minus the first two terms. The fact that the sum of the first two terms gives the full leading 
power, complete with its logarithm, is demonstrated by showing that the remainder. 



/•oo roo / 

Cq-{1- Z)-Kq-{1- Z)-Tq = dk dl \ 



Q Q 



Q + k + m Q + m) 

"•^ ^ ^ 2' (38) 



k -\- m k + mj {I + m) 



24 



is power suppressed. To see this, we observe that the potentially leading contributions, when
$k \lesssim Q$ and $l \sim m$, are cancelled by the subtractions. There is a possible UV divergence
as $k \to \infty$, but this is cancelled by the subtraction in the second factor. This subtraction
suppresses the region $k \gg l$, and it is as effective at suppressing the region for the
ultra-violet divergence, viz. $k \to \infty$, as it is at suppressing the original region it was
designed to handle, $k \sim Q$.
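
The power suppression can be checked numerically. The sketch below is only an illustration and is not part of the original derivation. It assumes the toy factors $C_0 = Q/(Q+k+m)$, $K_0 = \alpha_s/(k+l+m)$ and $T_0 = 1/(l+m)^2$, as read off from Eqs. (37) and (38), and models $Z$ as setting to zero, in the factor on its left, the momentum that flows through it. The function names, the grid limits and the choice $\alpha_s = m = 1$ are assumptions made for this example.

```python
import math

ALPHA_S = 1.0  # assumed value; only the Q-dependence matters for this check


def c0(k, Q, m):
    """Toy hard factor C0 = Q / (Q + k + m)."""
    return Q / (Q + k + m)


def k0(k, l, m):
    """Toy rung K0 = alpha_s / (k + l + m)."""
    return ALPHA_S / (k + l + m)


def t0(l, m):
    """Toy target factor T0 = 1 / (l + m)^2."""
    return 1.0 / (l + m) ** 2


def nlo_integrands(k, l, Q, m):
    """Integrands of C0.Z.K0.T0, C0.(1-Z).K0.Z.T0 and the remainder C0.(1-Z).K0.(1-Z).T0."""
    dc = c0(k, Q, m) - c0(0.0, Q, m)   # C0.(1 - Z): subtract C0 at k = 0
    dk = k0(k, l, m) - k0(k, 0.0, m)   # K0.(1 - Z): subtract K0 at l = 0
    first = c0(0.0, Q, m) * k0(k, l, m) * t0(l, m)
    second = dc * k0(k, 0.0, m) * t0(l, m)
    remainder_term = dc * dk * t0(l, m)
    return first, second, remainder_term


def log_midpoints(lo, hi, step):
    """Midpoint rule in ln(x): returns (x, weight) pairs with weight = x * step."""
    u0 = math.log(lo)
    n = int(round((math.log(hi) - u0) / step))
    points = []
    for i in range(n):
        x = math.exp(u0 + (i + 0.5) * step)
        points.append((x, x * step))
    return points


def remainder(Q, m=1.0, step=0.1):
    """Numerical value of the double integral in Eq. (38) for the assumed toy factors."""
    grid = log_midpoints(1e-7 * m, 1e10 * m, step)
    total = 0.0
    for k, wk in grid:
        dc = c0(k, Q, m) - c0(0.0, Q, m)
        inner = 0.0
        for l, wl in grid:
            inner += wl * (k0(k, l, m) - k0(k, 0.0, m)) * t0(l, m)
        total += wk * dc * inner
    return total


if __name__ == "__main__":
    for Q in (10.0, 100.0, 1000.0):
        print(f"Q/m = {Q:6.0f}   remainder = {remainder(Q):.4f}")
```

At fixed $m$ the printed remainder shrinks as $Q/m$ grows, whereas the first two terms carry the logarithm. The three integrands returned by `nlo_integrands` add up to the unsubtracted integrand point by point, which is the statement that the decomposition is an identity.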

E. NLO: renormalization 

Next, we perform renormalization in the two terms contributing to the leading power. We
can remove the UV divergence in each term separately by adding suitable counterterms; in
the factorization theorem this would amount to defining renormalized composite operators,
a procedure we will implement in Secs. VII A - VII C. A convenient method of constructing
counterterms is subtraction of the asymptote |3^. So we can define the lowest-order
coefficient times the renormalized two-loop matrix element to be

$$ R[C_0 \cdot Z \cdot K_0 \cdot T_0] = \frac{Q}{Q+m} \int_0^\infty dk \int_0^\infty dl \left[ \frac{\alpha_s}{k+l+m} - \frac{\alpha_s \, \theta(k > \mu)}{k} \right] \frac{1}{(l+m)^2}. \tag{39} $$

In field theory, a sensible counterterm to a subgraph is a polynomial in the external momenta
of the subgraph. If we use minimal subtraction, the counterterm is also polynomial in masses.
The degree of the polynomial is equal to the degree of divergence. In our toy example, this
means that the counterterm has to be independent of $l$ and $m$. The counterterm
$\alpha_s \theta(k > \mu)/k$ does indeed satisfy this criterion. The $\theta$ function is needed to prevent there from being
an infra-red divergence in the counterterm, and the arbitrary parameter $\mu$ has the function
of a renormalization/factorization scale, just as in conventional minimal subtraction.
It now follows that the renormalized one-loop coefficient function is

$$ R[C_0 \cdot (1-Z) \cdot K_0 \cdot Z] = \int_0^\infty dk \left[ \left( \frac{Q}{Q+k+m} - \frac{Q}{Q+m} \right) \frac{\alpha_s}{k+m} + \frac{Q}{Q+m} \, \frac{\alpha_s \, \theta(k > \mu)}{k} \right], \tag{40} $$

which is multiplied by the one-loop matrix element $\int_0^\infty dl/(l+m)^2$. The counterterms in
the above two terms are equal and opposite, so that the sum of the two renormalized
contributions to the leading power is the same as the sum of the bare terms. Notice that
if we choose the factorization scale $\mu$ to be of order $Q$, then the integral in the one-loop
coefficient function is dominated by $k$ of order $Q$.
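
As a cross-check of Eq. (40), the following sketch integrates the renormalized one-loop coefficient numerically and compares it with a closed form. The closed form was obtained here by partial fractions and is not quoted from the text; the integrand is the one read off from Eq. (40), and the function names, cut-offs and the value $\alpha_s = 1$ are assumptions of this example.

```python
import math

ALPHA_S = 1.0  # assumed value of the coupling factor


def integrand(k, Q, mu, m):
    """Integrand of Eq. (40): subtracted bare term plus the counterterm alpha_s*theta(k > mu)/k."""
    bare = (Q / (Q + k + m) - Q / (Q + m)) / (k + m)
    counterterm = Q / (Q + m) / k if k > mu else 0.0
    return ALPHA_S * (bare + counterterm)


def log_midpoint(f, lo, hi, n):
    """Midpoint rule for the integral of f over [lo, hi], with n cells uniform in ln(x)."""
    u0 = math.log(lo)
    h = (math.log(hi) - u0) / n
    total = 0.0
    for i in range(n):
        x = math.exp(u0 + (i + 0.5) * h)
        total += f(x) * x * h
    return total


def coefficient_numeric(Q, mu, m, n=4000):
    """Integrate Eq. (40), splitting at k = mu so the step in theta(k > mu) sits on a cell edge."""
    def f(k):
        return integrand(k, Q, mu, m)
    return log_midpoint(f, 1e-10 * mu, mu, n) + log_midpoint(f, mu, 1e10 * mu, n)


def coefficient_closed_form(Q, mu, m):
    """alpha_s * [ln(Q+m) - m/(Q+m) ln(m) - Q/(Q+m) ln(mu)], derived for this example."""
    mass_term = m / (Q + m) * math.log(m) if m > 0.0 else 0.0
    return ALPHA_S * (math.log(Q + m) - mass_term - Q / (Q + m) * math.log(mu))


if __name__ == "__main__":
    for Q, mu, m in ((100.0, 100.0, 1.0), (100.0, 10.0, 1.0), (100.0, 10.0, 0.0)):
        num = coefficient_numeric(Q, mu, m)
        exact = coefficient_closed_form(Q, mu, m)
        print(f"Q={Q:g} mu={mu:g} m={m:g}: numeric={num:.6f} closed form={exact:.6f}")
```

In this toy setting the result is small when $\mu$ is of order $Q$ and grows logarithmically when $\mu \ll Q$, which is the behavior the choice $\mu \sim Q$ is meant to avoid.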

F. Zero mass limit of coefficient function 

Finally, we observe that the coefficient function has a finite $m \to 0$ limit. (The coefficient
function is the sum of the lowest order term $C_0 \cdot Z = Q/(Q + m)$, the one-loop term Eq. (40),
and higher-order terms.) In a field theory, the existence of the zero mass limit implies that
the coefficient function is infra-red safe and is a symptom of the perturbative computability
of the coefficient function in QCD when $Q$ is large.

$Q \gtrsim k$ includes the regions $k \ll Q$ and $k \sim Q$.

FIG. 8. Regions of momentum integration that give the UV divergences in the operator matrix
element defined by Eq.

For example, the massless limit of Eq. (40) is

$$ \int_0^\infty dk \left[ \left( \frac{Q}{Q+k} - 1 \right) \frac{\alpha_s}{k} + \frac{\alpha_s \, \theta(k > \mu)}{k} \right]. \tag{41} $$

The infra-red divergence (at $k = 0$) in the term $\int dk \, \frac{Q}{Q+k} \frac{\alpha_s}{k}$ is cancelled by the subtraction
in the first term. The subtraction is designed to cancel the region where $k \ll Q$, and this
includes the region of the possible infra-red divergence.
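
As a worked check (done here, not taken from the original text), the massless one-loop coefficient can be evaluated in closed form, assuming the integrand $\left(\frac{Q}{Q+k} - 1\right)\frac{\alpha_s}{k} + \frac{\alpha_s\theta(k>\mu)}{k}$ that follows from setting $m = 0$ in Eq. (40):

```latex
\begin{align*}
\int_0^\infty dk \left[ \left( \frac{Q}{Q+k} - 1 \right) \frac{\alpha_s}{k}
      + \frac{\alpha_s \, \theta(k > \mu)}{k} \right]
 &= \alpha_s \left[ - \int_0^\mu \frac{dk}{Q+k}
      + \int_\mu^\infty dk \left( \frac{1}{k} - \frac{1}{Q+k} \right) \right] \\
 &= \alpha_s \left[ - \ln\frac{Q+\mu}{Q} + \ln\frac{Q+\mu}{\mu} \right]
  = \alpha_s \ln\frac{Q}{\mu} .
\end{align*}
```

The result is finite at both ends of the integration range and vanishes for $\mu = Q$, so in this toy model no large logarithm is left in the coefficient when the scale is chosen of order $Q$.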

One reason for emphasizing the zero mass limit is that calculations become algorith- 
mically much simpler, especially for the analytic evaluation of Feynman graphs. But our 
derivation shows that a non-zero mass may be left in the calculation of the coefficient func- 
tions, as would be appropriate if the mass is not sufficiently small compared with Q. 

VII. USE OF RENORMALIZED PARTON DENSITIES 

We now return to the factorization theorem in field theory. 

A. Renormalization of operators 

To construct the final form of the factorization, we will re-express the bare factorization
theorem, Eq. (25), in terms of the matrix elements of renormalized operators. These operators
have no UV divergences, unlike the bare operator matrix elements defined in Eq.

Now, the divergences come from regions of the form shown in Fig. 8. This figure is very
reminiscent of Fig. 1, for the very good reason that the derivation of the associated regions is
essentially identical for the two cases. We will choose to renormalize the divergences in the
MS scheme using dimensional regularization. As we will see, the fact that the counterterms
in this scheme are mass-independent will permit us to take the zero mass limit for the
coefficient function without encountering mass divergences introduced by the renormalization
counterterms. Minor changes to the argument would permit the use of any other suitable
scheme.

To see what to do, let us first expand the bare operator matrix element, $A_b$, in powers
of $K_0$:

$$ A_b = Z \cdot T_0 + Z \cdot K_0 \cdot T_0 + Z \cdot K_0 \cdot K_0 \cdot T_0 + \cdots \tag{42} $$

FIG. 9. One-rung graph for the matrix element.

FIG. 10. Two-rung graph for the matrix element.

The first term is UV finite. The second term has a divergence when the loop momentum
$k$ joining the operator vertex and $K_0$ (Fig. 9) goes to infinity. It can be renormalized by
subtracting the pole part at $\epsilon = 0$. (We define the number of space-time dimensions to be
$4 - \epsilon$.) This gives a result we symbolize as

$$ R[Z \cdot K_0 \cdot T_0] = Z \cdot K_0 \cdot T_0 - \text{pole part}(Z \cdot K_0) \cdot T_0 = Z \cdot K_0 \cdot (1 - \mathcal{P}) \cdot T_0. \tag{43} $$

Here $\mathcal{P}$ means to take the pole part of everything to its left, with the usual modifications of
the pole part that define the MS scheme. Although we have used a notation that suggests
$\mathcal{P}$ is to be treated as a linear operator, it does not in fact obey all the properties of linear
operators, in particular associativity.

Renormalization of graphs with two or more rungs is more interesting. For example,
the two-rung graphs, Fig. 10, have a subdivergence as the left-most loop momentum $k$
goes to infinity; this is exactly the same divergence as in the one-rung graphs, Fig. 9. It
must be cancelled by the one-rung counterterm before we add in the counterterm for the
two-rung divergence, which occurs when both the loop momenta, $k$ and $l$, go to infinity.
Note that there will also be UV divergences inside each rung from divergent self-energy and
vertex graphs. These are associated with renormalization of the Lagrangian and are present
independently of the UV divergences that we are discussing now, divergences that are due
to the use of composite operators. The divergences associated with the interactions are
cancelled by the usual collection of counterterms in the Lagrangian, so that $C_0$, $K_0$ and $T_0$
are finite before we convolute them together. This implies, in particular, that the Green
functions that define these amplitudes are Green functions of renormalized fields.

° Compare the remarks of Curci, Furmanski and Petronzio below Eq. (2.25) of Ref. [18], and see
also App. ^ of the present paper.

According to this procedure, the one-rung divergence in Fig. 10 is cancelled by a counterterm

$$ -Z \cdot K_0 \cdot \mathcal{P} \cdot K_0 \cdot T_0, \tag{44} $$

and so the two-rung counterterm is

$$ -Z \cdot K_0 \cdot (1 - \mathcal{P}) \cdot K_0 \cdot \mathcal{P} \cdot T_0. \tag{45} $$

The important point in the definition of $\mathcal{P}$ is that it must only be applied to quantities (to
its left) that are free of subdivergences. To do otherwise would generate counterterms that
have non-polynomial dependence on the external momenta and that can therefore not be
interpreted in terms of operator renormalization. The renormalized value of the operator to
two-rung order is therefore

$$ Z \cdot K_0 \cdot (1 - \mathcal{P}) \cdot K_0 \cdot (1 - \mathcal{P}) \cdot T_0. \tag{46} $$

This pattern evidently generalizes. To renormalize the operator matrix element, we
simply insert a factor of $1 - \mathcal{P}$ to the right of every $K_0$ factor. The result is that the
renormalized matrix element is

$$ A_r = Z \cdot \sum_{n=0}^{\infty} \left[ K_0 \cdot (1 - \mathcal{P}) \right]^n \cdot T_0 = Z \cdot \frac{1}{1 - K_0 \cdot (1 - \mathcal{P})} \cdot T_0. \tag{47} $$

The structure here is very similar to our construction of the remainder, Eq. (18). This is
not surprising, since in both cases we are cancelling contributions from a set of regions of
loop-momentum space that have very similar structures.

Given that $Z$ effectively represents the vertices for the operators that define parton
densities, Eq. (47) is our definition of the parton densities, up to a trivial normalization
factor.



B. Operator renormalization is multiplicative 

At first sight, the above manipulations give a rather arbitrary definition of the renormal- 
ization of the operators and of the parton densities. In fact, as we will now show, they give 
a definition in which the renormalized and bare parton densities differ by a multiplicative 
factor, with the multiplication being in the sense of convolution over fractional longitudi- 
nal momentum. Therefore the only freedom is the usual renormalization-group freedom to 
change the renormalization scheme or to change the scale parameter(s) within a particular 
scheme. 
What enables these results to be proved is the fact that renormalization counterterms
are polynomial in the external momenta of the subgraph to which they apply. Thus the
counterterms can be interpreted as factors times operator vertices. (The same property
is what enables renormalization of the interaction to work.) Moreover, the fact that the
divergences are logarithmic implies that the operator vertices are just the ones defining
the bare parton densities. These properties can be summarized by the statement that
multiplying $\mathcal{P}$ on the right by $Z$ has no effect:

$$ X \cdot \mathcal{P} = X \cdot \mathcal{P} \cdot Z. \tag{48} $$

Here $X$ is any quantity which is free of subdivergences.

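Now we can express the renormalized parton densities in terms of the bare parton
densities:

$$ \begin{aligned}
A_r &= Z \cdot \frac{1}{1 - K_0 (1 - \mathcal{P})} \cdot T_0 \\
    &= Z \cdot \left[ 1 - \frac{1}{1 - K_0 (1 - \mathcal{P})} \cdot K_0 \mathcal{P} \right] \cdot \frac{1}{1 - K_0} \cdot T_0 \\
    &= \left[ Z - Z \cdot \frac{1}{1 - K_0 (1 - \mathcal{P})} \cdot K_0 \mathcal{P} \right] \cdot Z \cdot \frac{1}{1 - K_0} \cdot T_0 \\
    &= G \otimes A_b.
\end{aligned} \tag{49} $$

In the next-to-last line, we have used $Z^2 = Z$ and $\mathcal{P} \cdot Z = \mathcal{P}$, to write the result in terms of
an explicit factor times the bare operator matrix element. Then we observe that there is a
factor $Z$ at the left of the operator matrix element $Z \cdot \frac{1}{1 - K_0} \cdot T_0$ and that the integral coupling
it to everything further to the left only involves the $+$ component of momentum. Thus the
result has the form of a convolution over longitudinal momentum fraction, for which we use
the symbol $\otimes$.
The factor

$$ G = Z - Z \cdot \frac{1}{1 - K_0 (1 - \mathcal{P})} \cdot K_0 \mathcal{P} \tag{50} $$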



is the renormalization factor of the operator defining the parton densities. We can therefore
write the renormalized parton densities in terms of the unrenormalized ones:

$$ f^{R}_{i/p}(x) = \sum_j \int \frac{d\xi}{\xi} \, G_{ij}(\xi/x) \, f^{B}_{j/p}(\xi), \tag{51} $$

where we have now explicitly displayed the sum over parton flavors and the integral over
momentum fraction $\xi$. Let us reiterate that the word 'bare' is used in the sense of 'lacking
UV renormalization', and has no connection with another common usage of the word in
this context TP]. The renormalization factor starts with a lowest order term which is
effectively a unit operator:

$$ G_{ij} = \delta_{ij} \, \delta(\xi/x - 1) + O(\alpha_s). \tag{52} $$
C. Factorization with renormalized parton densities

Once we have seen that the renormalization of the operators is multiplicative, we can
write the factorization theorem Eq. (25) in terms of renormalized quantities:

$$ F = C_r \otimes A_r + \text{remainder } r, \tag{53} $$

where the renormalized coefficient function is

$$ C_r = C_b \otimes G^{-1}, \tag{54} $$

with $G^{-1}$ being the inverse of the renormalization factor $G$ for the parton densities $A_r$. The
inverse is with respect to convolution in the longitudinal momentum fraction.

It is possible to derive a simple and very plausible, but wrong, formula for the renor- 
malized coefficient function. The derivation relies on using associativity for the pole part 
operation. We give the false derivation in App. 0, since it is instructive. 

There does not appear to be a simple closed formula for the renormalized coefficient 
function. But there is a convenient recursion relation that we will now derive. It corresponds 
to the actual algorithms used to do real calculations. 

The derivation starts from the fact that by our definition of $C_r$,

$$ C_r \otimes A_r = C_b \otimes A_b. \tag{55} $$

We simply expand all quantities in this in powers of $K_0$. Since we already know the $n$th
order terms for $C_b$, $A_b$, and $A_r$,

$$ C_b^{(n)} = C_0 [(1 - Z) K_0]^n Z, \qquad A_b^{(n)} = Z K_0^n T_0, \qquad A_r^{(n)} = Z [K_0 (1 - \mathcal{P})]^n T_0, \tag{56} $$

we can obtain the expansion of $C_r$, which we write as

$$ C_r = \sum_{n=0}^{\infty} C_r^{(n)}. \tag{57} $$

Our problem is to find an explicit formula for the term $C_r^{(n)}$, given the lower order terms.
Expanding Eq. (53) to zeroth order in $K_0$, we find

$$ C_0 Z T_0 = C_r^{(0)} Z T_0. \tag{58} $$

This equation is true for any value of $T_0$, since factorization applies for any initial state.
Hence we must have $C_r^{(0)} = C_0 Z$, the same as the corresponding term in the bare coefficient.
To first order, we have

$$ C_r^{(1)} A_r^{(0)} + C_r^{(0)} A_r^{(1)} = C_b^{(1)} A_b^{(0)} + C_b^{(0)} A_b^{(1)}, \tag{59} $$

which gives

$$ \begin{aligned}
C_r^{(1)} &= C_0 (1 - Z) K_0 Z + [C_0 Z] \, [(Z K_0) \mathcal{P}] \\
          &= C_0 K_0 Z - C_0 Z \left[ Z K_0 - (Z K_0) \mathcal{P} \right].
\end{aligned} \tag{60} $$

A convenient way of formulating this is to say that the right-hand-side is the structure
function of an on-shell quark (or gluon) minus the lower order term in the Wilson expansion
of this partonic structure function.

Notice very carefully the placement of the pole-part operation. It is tempting to treat
the last term on the first line of this equation as $(C_0 Z K_0) \mathcal{P}$. But this would mean that the
pole-part operation would be applied to the whole object $C_0 Z K_0$, whereas it should only be
applied to the quantity that is an operator matrix element, i.e., to $Z K_0$; this is indicated by
the brackets. The incorrect method, of taking the pole part of everything, i.e., of $C_0 Z K_0$,
will get different results from the correct method if $C_0$ has any dependence on the regulator
parameter $\epsilon$ (see App. 0).
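
The effect of misplacing the pole-part operation can be seen in a minimal sketch that represents each factor as a truncated Laurent series in $\epsilon$. The series used below are hypothetical numbers chosen only to expose the effect, and `pole_part` implements plain minimal subtraction (keeping negative powers of $\epsilon$) without the extra conventions of a specific scheme.

```python
def multiply(a, b):
    """Product of two truncated Laurent series in eps, stored as {power: coefficient}."""
    out = {}
    for i, x in a.items():
        for j, y in b.items():
            out[i + j] = out.get(i + j, 0) + x * y
    return {p: c for p, c in out.items() if c != 0}


def pole_part(a):
    """Keep only the negative powers of eps."""
    return {p: c for p, c in a.items() if p < 0}


# Hypothetical inputs, for illustration only:
C0Z = {0: 1, 1: 2}    # coefficient with O(eps) dependence: 1 + 2*eps
ZK0 = {-1: 3, 0: 5}   # operator matrix element: 3/eps + 5

correct = multiply(C0Z, pole_part(ZK0))   # [C0 Z] [(Z K0) P]: pole part of Z K0 only
naive = pole_part(multiply(C0Z, ZK0))     # (C0 Z K0) P: pole part of the whole product

print("correct:", correct)   # 3/eps + 6
print("naive:  ", naive)     # 3/eps
```

The two prescriptions differ by a finite term that comes from the $O(\epsilon)$ part of the coefficient multiplying the pole of the matrix element, which is exactly the situation described above.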

For the general case, we apply the factorization theorem to a target which is a single
on-shell parton. The structure function in this case, $F_p$, is obtained by setting $D = 0$ and
$T_0 = Z$ in Eq. (17), and it follows that the remainder term $r$ is zero (see Eq. (18)). We let
$A_{bp}$ and $A_{rp}$ correspond to parton densities on a parton target:

$$ A_{bp} = Z \cdot \frac{1}{1 - K_0} \cdot Z, \qquad A_{rp} = Z \cdot \frac{1}{1 - K_0 (1 - \mathcal{P})} \cdot Z. \tag{61} $$

Then the bare factorization theorem Eq. (25) becomes just

$$ F_p = C_b \otimes A_{bp}, \tag{62} $$

while the renormalized factorization theorem on a parton target is

$$ F_p = C_r \otimes A_{rp}. \tag{63} $$

Neither of these equations has a remainder term. The coefficient function is, of course,
target-independent; it is the same here, on a parton target, as in the factorization theorem
on a hadron target.

We expand in powers of $K_0$, and the $n$th term in $F_p$ is

$$ F_p^{(n)} = C_r^{(n)} + \sum_{j=0}^{n-1} C_r^{(j)} A_{rp}^{(n-j)}. \tag{64} $$



Observe that the word "parton" has just been used with two different meanings. The parton 
target is an on-shell state corresponding to one of the elementary fields in the Lagrangian. A parton 
density is a number density computed using a particular operator involving the corresponding field. 
Thus a parton density in a parton is a non-trivial but non-contradictory concept. 

Note that this equation has no remainder term even if we have non-zero quark masses, since
we have not yet taken a zero-mass limit in the coefficient function. To compute the coefficient
function for a light parton, it is normally convenient to take the zero mass limit, as we will see
later. In that case the remainder term on a parton target will become nonzero.
Rewriting this equation as

$$ C_r^{(n)} = F_p^{(n)} - \sum_{j=0}^{n-1} C_r^{(j)} A_{rp}^{(n-j)} \tag{65} $$

gives the desired recursion. The $n$th order renormalized coefficient is the $n$th order partonic
structure function minus lower-order terms in the Wilson coefficients times partonic matrix
elements of the operators defining the parton densities. Both the partonic structure functions
and the partonic operator matrix elements can be computed in perturbation theory, and
actual calculations to order exist. The recursion starts at order 0, where the coefficient
function is the lowest-order partonic structure function; the first non-trivial case, for $n = 1$,
is exactly Eq. (60).

The indices $n$ and $j$ can equally well be interpreted as parameterizing an expansion in
loops (or $\alpha_s$) as well as an expansion in powers of $K_0$.
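
The recursion can be sketched in a few lines of code. The example below makes simplifying assumptions that are not in the text: a single parton flavor and a fixed moment in the momentum fraction, so that the convolution becomes an ordinary product, with the lowest-order matrix element normalized to 1. The input numbers are hypothetical and serve only to show the bookkeeping.

```python
def renormalized_coefficients(F, A):
    """Solve F[n] = sum_{j<=n} C[j] * A[n-j] for C order by order, assuming A[0] == 1."""
    if A[0] != 1:
        raise ValueError("expects the lowest-order matrix element to be the unit operator")
    if len(A) < len(F):
        raise ValueError("need matrix elements up to the same order as F")
    C = []
    for n in range(len(F)):
        # Lower-order coefficients times partonic matrix elements, as in the recursion.
        lower_orders = sum(C[j] * A[n - j] for j in range(n))
        C.append(F[n] - lower_orders)
    return C


def convolve(C, A):
    """Order-by-order product, used to check that C times A reproduces F."""
    return [sum(C[j] * A[n - j] for j in range(n + 1)) for n in range(len(C))]


# Hypothetical expansion coefficients at orders 0, 1, 2 (illustration only).
F_p = [1, 5, 7]    # partonic structure function
A_rp = [1, 2, 3]   # renormalized parton density in a parton

C_r = renormalized_coefficients(F_p, A_rp)
print(C_r)                   # [1, 3, -2]
print(convolve(C_r, A_rp))   # [1, 5, 7]
```

Each new order needs only the partonic structure function at that order and quantities already computed at lower orders, which is why the recursion matches how such calculations are organized in practice.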



VIII. PARTON DENSITIES 



A. Gauge-invariant parton-densities 



Our derivation leads to a factorization theorem in which the bare parton densities are
defined by formulae like

$$ f_b(x) = \int \frac{dy^-}{2\pi} \, e^{-i x p^+ y^-} \, \langle p | \bar\psi(0, y^-, 0_T) \, \gamma^+ \, \psi(0) | p \rangle. \tag{66} $$

(The vacuum expectation value of the operator should be subtracted, so that this matrix
element is a connected one.) In a gauge theory like QCD, this is a matrix element of a
gauge-variant operator. The gauge to be used to define the operator is the light-cone gauge
$A^+ = 0$, since that was the gauge used for the proof of factorization. In accordance with the
derivation, the two quark fields are renormalized quark fields. However, as we saw, there
are divergences associated with the bilocal light-cone operator, so this formula, without
renormalization, defines a bare parton density.

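As is well known, a gauge invariant form of the parton density can easily be made by
inserting a path-ordered exponential of the gluon field:

$$ f_b(x) = \int \frac{dy^-}{2\pi} \, e^{-i x p^+ y^-} \, \langle p | \bar\psi(0, y^-, 0_T) \, P \exp\left[ -i g_0 \int_0^{y^-} dy'^- \, t_a A_a^+(0, y'^-, 0_T) \right] \gamma^+ \, \psi(0) | p \rangle. \tag{67} $$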

A better definition of a bare parton density is to replace the renormalized quark fields by bare 
quark fields. This new definition differs from the one given above by a factor of the quark's wave- 
function renormalization. The advantage of this second definition is that it is renormalization-group 
invariant, so that formal derivations of the renormalization-group equation are simpler. 
In the light-cone gauge $A^+ = 0$, the exponential reduces to unity, so that the parton density
agrees with the previous definition. Note that to get gauge-invariance the coupling and the
gluon field in the exponential are the bare ones.

Renormalization is performed by convoluting the bare parton densities with the previously
determined renormalization factor.

Notice that the recursion formula, Eq. (65), for the coefficient function is actually
gauge-invariant, if we interpret it as an equation for terms in expansions in powers of $\alpha_s$. For example,
the left-hand side is the $\alpha_s^n$ term in the expansion of the structure function of an on-shell
quark or gluon, and the coefficients $A_{rp}^{(n-j)}$ are terms in the expansion of the renormalized
parton densities in the same on-shell quark or gluon state.

B. Evolution equations 

The final element in the factorization formalism that makes it useful for phenomenology
is the set of DGLAP evolution equations. Since the parton densities are matrix elements
of renormalized composite operators, the evolution equations are just the ordinary
renormalization-group equations for the operators. To use the factorization formula one
sets the renormalization/factorization scale $\mu$ to be of order $Q$. Then there are no large
logarithms in the coefficient functions, for which low-order perturbation calculations are
therefore useful. The parton densities at different scales are related by use of their evolution
equations.

Since we have chosen to use MS renormalization, the renormalization-group coefficients 
are independent of masses, and are in fact the ones normally used. This is true even if 
one (or more) of the quarks is heavy and has a mass M comparable with Q. Our proof 
of factorization has demonstrated that all relevant effects of non-zero quark masses can be 
found either in the coefficient functions or in the starting values of the parton densities. 

Of course, one can perturbatively compute the values of the heavy quark densities, by 
the methods that Witten p5[ first devised. In our formalism this is most conveniently done 
in association with the version of factorization that is appropriate when M is bigger than 
Q, which we will treat in Sect. 

IX. QUARK MASSES IN THE COEFFICIENT FUNCTION 

In conventional treatments of factorization, masses are set to zero in the coefficient 
functions. But our treatment has preserved masses, and this is the key to a correct treatment 
of the effects of heavy quarks. 

A. Massless limit 

The massless limit can be taken in the coefficient function. This can be done since the
$1 - Z$ factors in Eq. (56) cancel leading power contributions from all regions except where
all the loop momenta are of order $Q^2$ in virtuality, and except for regions that contribute
to the (cancelled) UV divergences. Thus setting a mass $m$ to zero gives an error that is
a power of $m/Q$. A particular consequence of this result is that all potential collinear
divergences are cancelled. Thus the coefficient function is a truly infra-red safe quantity. If
the renormalization mass $\mu$ is chosen to be of order $Q$, then perturbative calculations can
be made.

Since errors in setting a mass to zero are a power of m/Q, taking the massless limit 
is sensible if all the quark masses are of the order of a typical hadronic mass or smaller; 
the errors are no bigger than errors that have been made elsewhere in the derivation of 
factorization. 

B. Heavy quarks 

However, there are quarks whose masses are larger than this (charm, etc.). Let us first
treat the case that there is only one heavy quark, of mass $M$. It is not always appropriate
to set $M = 0$ in the coefficient functions, since the error in doing so is of order $(M/Q)^p$,
which may be much bigger than the error associated with dropping the remainder term in
the derivation of the factorization theorem. An error of order $(M/Q)^p$ may also be larger
than the error caused by using a finite order truncation of the perturbation series for the
coefficient function.

Now, the error in the factorized form of the structure function is of order $(\Lambda/Q)^p$, and
the derivation of this error estimate is valid over the whole range of quark mass for which
$Q \gtrsim M$. This means both the region where $Q$ is of order $M$ and the region where $Q$ is much
bigger than $M$. The remainder term is uniformly suppressed by a power of $\Lambda/Q$. The sole
effect of a heavy quark line is to restrict its virtuality to be at least of order $M^2$, and this is
completely compatible with the derivation of the error estimate.

We therefore have a factorization theorem that is valid in the whole of the region that
$Q \gtrsim M$, as we have already observed. If $Q$ is sufficiently much bigger than all the quark
masses, then we may set all the masses to zero in the coefficient function. If some of the
quark masses are non-negligible, then we simply leave their masses at their correct values.

However, these considerations only apply if $M \lesssim Q$. If, on the contrary, a heavy quark
mass is much larger than $Q$, then the coefficient functions that we constructed have
logarithms of $M/Q$ in this region of relatively small $Q$. This is a problem we will treat in Sec.
^. The work in this section is based on a factorization theorem derived under the condition
that $Q$ is at least comparable with $M$.

Despite the fact that we have retained heavy quark masses wherever necessary, the
kernels of the evolution equations for the parton densities are in fact the same as with
the quark masses set to zero, i.e., they are identical to the ordinary DGLAP equations in
the MS scheme. This happens because the evolution equations are in our approach just the
renormalization group equations for the renormalized parton densities. The Altarelli-Parisi
kernels are anomalous dimensions, obtained from the renormalization factor $G_{ij}$. Since the
renormalization counterterms in the MS scheme are mass-independent, so are the
Altarelli-Parisi kernels, a statement that is true not only for the leading-order $\alpha_s$ terms in the kernels,
but for all higher order corrections.
C. Redefinition of the Z operation

The analytic core of our proof is in the definition of the $Z$ operation and in the proof
that the remainder term, Eq. (18), is suppressed by a power of $\Lambda/Q$. The rest of the proof
is simple linear algebra. It is possible to adjust the definition of $Z$ to make calculations
more convenient. We have already made one such redefinition; see Eqs. (^) and (pA]).

In the next section we will propose one further redefinition of $Z$ that will simplify some
calculations, by allowing heavy quark masses to be set to zero in certain parts of the
calculations of the coefficient functions. But first we must characterize the allowed redefinitions.
We address explicitly only the momentum dependence of $Z$. The spin-dependent part can
be discussed in a similar fashion.

The first and most essential property is that $Z$ provide a good approximation to leading
regions, of the form of Fig. |^, i.e., that

$$ H \cdot Z \cdot T = H \cdot T + \text{non-leading power}, \tag{68} $$

whenever we are in an integration region where the virtualities in $H$ are much bigger than
the virtualities in $T$. The second property is that when we go outside the momenta for
which $Z$ gives a good approximation, insertion of a factor of $Z$ should not produce a result
that is much bigger than the original. To make this precise, let $H$ and $T$ be subdiagrams
that could be used in Fig. 1. We have

$$ H \cdot T = \int \frac{d^4k}{(2\pi)^4} \, H(q,k) \, T(k,p) \tag{69} $$

and

$$ H \cdot Z \cdot T = \int \frac{d^4k}{(2\pi)^4} \int \frac{d^4l}{(2\pi)^4} \, H(q,k) \, Z(k,l) \, T(l,p). \tag{70} $$

We require, with one exception, that $H \cdot Z \cdot T$ should not be much larger than $H \cdot T$. The
exception is that we can have a logarithmic ultra-violet divergence for large $l$.

The above properties are sufficient to ensure that the remainder as defined in Eq. (18)
is power suppressed. Then we can obtain the renormalized factorization theorem Eq. (53)
given that any divergences in the operator matrix elements are at worst logarithmic.

A final property is needed in order that the factorization theorem be of a usefully simple
form. We choose this to mean that factorization involves a convolution in just one variable,
a longitudinal momentum fraction. This forces the momentum-dependent part of $Z$ to be of
the form

$$ Z(k,l) = \delta^{(4)}(k^\mu - \hat{l}^\mu) \, f(l). \tag{71} $$

Here the function $f(l)$ must be unity when $l_T$ is less than about $Q$ and $l^-$ is less than
about $Q^2/p^+$. Moreover, the approximated momentum $\hat{l}$ must approach $(l^+, 0, 0_T)$ in the
collinear limit. Both $f(l)$ and $\hat{l}$ must be smooth functions. In order that the convolution
in the factorization formula be a convolution in one variable, the approximated momentum
must be independent of $l^-$ and $l_T$.

Perhaps the simplest and most natural definition is to write

$$ Z(k,l) = \delta^{(4)}\left( k^\mu - (l^+, 0, 0_T) \right) \theta(l_T < \mu), \tag{72} $$

which is just like Eq. except for a cut-off on the transverse momentum entering from
the right. This definition would be favored, for example, by Brodsky [35]. It corresponds to
defining parton densities by integrals of the following form:

$$ f(x,\mu) = \text{standard normalization factors} \times \int_{l_T < \mu} dl^- \, d^2 l_T \, \langle p | \tilde{\bar\psi}(-l) \, \gamma^+ \, \tilde\psi(l) | p \rangle, \tag{73} $$

where there is an integral over all virtualities of the parton from the target and an integral
up to a certain maximum transverse momentum, and we are using the Fourier-transformed
fields.

This definition suffers from two inconveniences. The first is that in a gauge theory it
does not give parton densities that are manifestly gauge invariant. The second is that the
evolution equations (in $\mu$) are not exactly homogeneous equations of the Altarelli-Parisi
form; a subsidiary expansion for large $\mu$ is needed to get the Altarelli-Parisi equations.

Neither disadvantage is fatal, but we prefer to use a definition in which $f(l) = 1$, as in
Eqs. (§) and (11). The parton densities are then precisely of the form of light-cone operators,
and UV renormalization must be applied as described in earlier sections.



D. Proposal for optimal redefinition of Z 

The remaining freedom in defining $Z$ resides in what it does to the factors on its left, and
in the definition of the approximated momentum $\hat{l}$. The most natural definition is perhaps
the one in Eq. (11). But a simplification is possible.

Let us first recall the classification of partons as light or heavy according to whether 
their masses are less than or greater than a few hundred MeV. Thus the gluon, and the up, 
down, and strange quarks are light, while the charm, bottom, and top quarks are heavy. The 
importance of this distinction is that it is always legitimate to neglect light parton masses 
in the hard scattering coefficients, since the errors in doing so are of the same order as the 
non- leading power corrections ('higher-twist terms') that constitute the remainder in the 
factorization formula. But it is not always valid to neglect heavy quark masses. Even if Q 
is much larger than the mass M of some heavy quark, the error resulting from replacing M 
by zero in the coefficient function is larger than the errors that result from neglect of higher 
twist terms. (In practice we normally have larger errors that result from truncation of the 
perturbation expansion of the coefficient functions, and then it will be sensible to neglect 
M at suitably high Q.) Note, however, that it is never legitimate to neglect masses in the 
parton density. 

So it is convenient to equip $Z$ with a prescription to set light parton masses to zero in everything
to its left. This new operation we call $Z_1$. Consider a convolution $H \cdot T$ like that implied by
Fig. 1, and suppose that $H$ and $T$ are joined by a pair of light parton lines. We have

$$ H \cdot T = \int d^4k \, H(q,k,m,M) \, T(k,p,m,M), \tag{74} $$

so that

$$ H \cdot Z_1 \cdot T = \int d\xi \, H(q, \xi \hat{p}, 0, M) \int d^2k_T \, dk^- \, T(k,p,m,M), \tag{75} $$

where $\hat{p} = (p^+, 0, 0_T)$, and, for simplicity, we have omitted the treatment of the Dirac
matrices, which is unchanged from our earlier work. We use $m$ and $M$ to refer to light and
heavy parton masses.

In the above equations, we have assumed that the limit of zero mass for the light partons
exists. This is, of course, normally not true if $H$ is a simple sum of Feynman graphs, such
as corresponds to the $H$ subgraph in Fig. 1. Rather $H$ should be a quantity such as a bare
coefficient function obtained from such a subgraph with a series of subtractions to cancel
the collinear regions, i.e., a quantity such as

Just like the pole part operation, $\mathcal{P}$, $Z_1$ is not a linear operator, at least not on momentum
space. Nevertheless it obeys enough of the algebraic rules for linear operators that the proof
of factorization still works if we replace $Z$ by $Z_1$. The advantage of the use of $Z_1$ is that it
directly implements the zero-mass limit for light partons in the definition of the coefficient
functions. It is necessary to add to the proof a verification that the zero-mass limit is
only being applied to quantities for which the limit exists, at all stages of the proof. The
verification is elementary, since the dangerous regions arise from regions of exactly the kind
that are suppressed by the $1 - Z_1$ factors in Eq. (76). We can apply the same arguments to
the renormalized coefficient functions as well.

In practical work, it is of course very important to take the zero mass limit wherever
possible, since massless Feynman graphs are generally much easier to calculate than massive
ones.

We now show that there are certain parts of calculations with heavy quarks where one
can correctly redefine $Z_1$ also to set heavy quark masses to zero, even when $Q$ is of order $M$.
Let us continue to define $Z_1$ as in Eq. (75) when the lines joining $H$ and $T$ are light partons.
The light parton masses are set to zero in $H$, but the heavy parton masses are not.

But now suppose $H$ and $T$ are joined by heavy quarks. We will now show that it is
legitimate to define $Z_1$ to set the heavy quark mass(es) to zero in $H$:

$$ H_Q \cdot Z_1 \cdot T_Q = \int d\xi \, H_Q(q, \xi \hat{p}, 0, 0) \int d^2k_T \, dk^- \, T_Q(k,p,m,M). \tag{77} $$

Here we have equipped $H$ and $T$ with a subscript $Q$ to symbolize their being joined by heavy
quark lines.

In Fig. 11 we show some diagrams to which $Z_1$ is applied, at the place indicated by the
vertical line. To allow zero mass limits to be taken, we assume implicit $1 - Z_1$ factors at
all necessary points to the left of the vertical bar, as in Eq. (76). In the case that there is
more than one heavy quark, one should set to zero only the masses of those quarks that are
lighter than the quarks joining $H$ and $T$. The need for this last requirement will become
apparent in the proof.

In the first three graphs, which have either gluons or light partons as their external lines, 
only the light quarks have their masses set to zero. But in the last three graphs, which have 
heavy quark external lines, all the quark masses should be set to zero; the external quarks 
will also be given massless on-shell momenta, $k^2 = 0$, and the Dirac matrix will be that for 
a massless quark. 

FIG. 11. Diagrams with the $Z_1$ operation applied at the vertical line. The heavy quarks are 
denoted by the thick solid lines. 

If it is indeed valid to define $Z_1$ in this way, a substantial simplification is achieved in 
practical calculations, since it is only necessary to retain non-zero masses for heavy quarks 
in loops of heavy quark lines in coefficient functions with external light lines, i.e., in graphs 
such as the first three of Fig. 11. 



The formal proof is as follows. 

1. $H \cdot Z_1$ is only used when $H$ has a zero-mass limit. Hence the virtualities in $H$ are of 
order $Q^2$ or larger. This is simply the assertion that collinear subtractions have been 
applied inside $H$, as in Eq. (76). 

2. If $H$ and $T$ are joined by heavy quark lines, the virtuality of the heavy quark is at 
least of order $M^2$ in the dominant region of integration, for the whole leading power. 
The virtuality, as is well known, is in fact space-like. 

3. In a region where the virtualities in $T$ are much less than the virtualities in $H$, 
$H \cdot Z_1 \cdot T$ provides as good an approximation to $H \cdot T$ as does the approximation with 
the heavy quark mass left non-zero. The original approximation involved replacing a 
momentum of space-like virtuality of order $M^2$ by an on-shell momentum. Instead we 
now replace it by a light-like momentum. The new $Z$ operation provides a suitable 
approximation given that the old operation did. Thus the first essential property of a 
$Z$ operation is obeyed. 

4. If the virtuality of the lines joining $H$ and $T$ is of order the virtualities in $H$, then 
setting masses to zero in $H$ changes the precise value but not the order of magnitude. 
Thus $H \cdot Z_1 \cdot T$ is of the same magnitude as $H \cdot T$ in this case. The second property 
for $Z$ is satisfied. 

5. The effect of $Z_1$ on $T$, in $Z_1 \cdot T$, is the same as for $Z$. Thus there is no change in the 
logarithmic UV divergences that are generated. 

FIG. 12. Born graph for heavy quark in DIS. 

FIG. 13. One-loop graph for heavy quark in DIS. 

A more physical argument can be made with the aid of an example. Consider the lowest-order 
calculation of the contribution of a heavy quark loop to a structure function. In Fig. 12, we have the 
Born graph for DIS on a heavy quark that comes out of the shaded target bubble. If $Q$ is 
much larger than the quark mass $M$, it is a useful approximation to replace the graph by the 
lowest-order Wilson coefficient times the heavy quark density, as shown on the right of Fig. 
12, for the important region where the quark has transverse momentum much less than $Q$. 
It is also a good approximation to replace $M$ by zero in the contribution to the coefficient 
function. 

Now both of these approximations fail when $Q$ is comparable to $M$ (the 'threshold 
region'). But in this case the heavy quark distribution is of order $\alpha_s$ relative to the gluon 
distribution. So to do valid phenomenology we must include also a one-loop coefficient times 
the gluon density. The result is shown in Fig. 13. We start with a particular kind of graph 
for the structure function where a heavy-quark loop couples to the target by gluon lines. 
To avoid extra irrelevant complications, suppose that the gluons have low virtuality. The 
first part of the right-hand side is a contribution to the coefficient function times the gluon 
density. In the one-loop coefficient there is a subtraction term. The second term on the 
right is the previously defined heavy quark coefficient function times a heavy quark density. 
In the region where the gluons have low virtuality this second term cancels the subtraction 
in the one-loop Wilson coefficient times the gluon density. 

Hence the error in the approximation used in Fig. 12 is compensated by the 
subtraction in Fig. 13. Of course, it would have been much simpler to use the heavy quark 
(or 'fixed-flavor') scheme that we will discuss in Sec. X. But that scheme does not permit 
us to go to large $Q$, because there will then be large logarithms of $Q/M$ in its coefficient 
functions. In contrast, the scheme in Figs. 12 and 13 permits an interpolation between low 
and high $Q$ without loss of accuracy. 

At sufficiently large $Q$, the Born term alone provides useful phenomenology, because the 
heavy quark density is large. Moreover, zero-mass coefficient functions can be used. 

As $Q$ is decreased towards the threshold region, the Born term in the coefficient function 
becomes increasingly inaccurate as a representation of the graph on the left of Fig. 12. Note 
that even if we evaluate the coefficient with the correct mass there is still an error of the 
same order of magnitude as the error in neglecting the mass completely. This is because the 
horizontal lines are necessarily space-like. Replacing them with on-shell lines gives an error 
of order $M^2/Q^2$. 

When one decreases $Q$, the errors in the approximation of Fig. 12 increase. At order $\alpha_s$ 
the errors are compensated by the subtraction term in Fig. 13. But beyond some point, 
the errors in the approximation become larger than the quantity one is trying to compute. 
Correct compensation of errors will involve the use of even higher order diagrams. Then one 
must abandon this scheme and use only the heavy quark scheme of Sec. X. The important 
point is that there is an overlap in the regions of validity of the two schemes. 



X. FACTORIZATION WITH Q<M 

When $Q$ is reduced below the mass $M$ of a heavy quark, the scheme described in Sec. 
V becomes inappropriate. Indeed, given a fixed value of $x$, we go below the threshold at 
$Q = 2M\sqrt{x/(1-x)}$ for producing the heavy quark by reducing $Q$ enough. On the other 
hand the factorization theorem that we derived earlier has a non-zero subprocess in which 
there is production of heavy quarks in the final state, for any value of $Q$. An example is 
given by Fig. 12. There we replace a graph for heavy quark production by the lowest-order 
approximation to the factorization theorem. The replacement of an off-shell heavy quark 
by an on-shell quark in the hard scattering enables the approximated graph to be non-zero, 
even when the true physical process is below the threshold for producing heavy quarks. 
The error in the approximation is repaired by higher-order approximations to the coefficient 
functions, as illustrated in Fig. 13. 
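
The threshold condition above can be made concrete with a small numerical sketch. It assumes that the target mass is neglected, so that the hadronic invariant mass squared is $W^2 = Q^2(1-x)/x$ and pair production requires $W \ge 2M$. The function names are hypothetical, introduced only for this illustration.

```python
import math


def threshold_q(x: float, heavy_mass: float) -> float:
    """Value of Q at which W = 2M for Bjorken x (target mass neglected).

    Assumption: W^2 = Q^2 (1 - x) / x, so W = 2M gives Q = 2M sqrt(x / (1 - x)).
    """
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    return 2.0 * heavy_mass * math.sqrt(x / (1.0 - x))


def is_above_threshold(q: float, x: float, heavy_mass: float) -> bool:
    """True if heavy-quark pair production is kinematically allowed."""
    return q > threshold_q(x, heavy_mass)
```

For instance, with a heavy quark mass of 1.5 GeV the threshold in $Q$ moves upward as $x$ increases, which is why a fixed $Q$ can be above threshold at small $x$ and below it at large $x$.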

Clearly it is likely to be a poor and inaccurate method of calculation to obtain an 
answer that is known to be zero by adding a collection of non-zero pieces, in a truncated 
perturbation expansion. Even a little above threshold we may have inaccurate calculations: 
a cross section that approaches zero as the threshold is approached is calculated as a sum 
over terms that do not have the correct threshold behavior. 

The remedy is to use a different version of the factorization theorem, in fact the 
well-known fixed-flavor-number scheme [jl],^. In this section we present a proof of factorization 
in this scheme in a form that will mesh with the formulation and proof of factorization that 
we gave earlier. Using the terminology introduced in Sec. III, we will say that the heavy 
quark is treated as non-partonic. It will be convenient, for the purposes of this section, to 
call this scheme the 'heavy-quark scheme'. The essence will be to treat the heavy quark as 
always being part of the hard scattering. This scheme has a range of validity that includes the 
whole region $Q \lesssim M$. This range overlaps with the range of validity of the factorization 
theorem where the heavy quark is treated as partonic, i.e., the range $Q \gtrsim M$. 

There are two important observations. One is that when $Q$ is of order $M$, the heavy 
quark mass provides a large scale of virtuality that can be treated on the same footing as $Q$. 
The second observation is that when $Q$ is much less than $M$, the decoupling theorem 
applies. Our heavy quark scheme will satisfy the decoupling theorem in the simplest way: 
one can simply drop all graphs involving heavy quark lines and obtain a correct answer 
without needing extra finite renormalizations of the coupling and parton densities. The 
method we will use is that of Collins, Wilczek and Zee [12], with the heavy quark being 
treated as non-partonic. In that subscheme, renormalization is done in the $\overline{\rm MS}$ scheme for 
all graphs except those involving the heavy quark. For graphs with a heavy quark loop, 
renormalization is done by subtraction at zero momentum and with the light quark masses 
set to zero. The remaining renormalizations involve graphs with external heavy quark lines. 
Following Buza et al. ||^, we define the heavy quark mass as the position of the pole in the 
heavy quark propagator, a definition that makes sense in perturbation theory. Remaining 
renormalizations are defined by pole-part subtractions, in the $\overline{\rm MS}$ style. 



The advantages of this scheme are [12]: 

• It satisfies manifest decoupling. 

• $\overline{\rm MS}$ and zero-momentum subtraction allow preservation of Ward identities in gauge 
theories without the need for extra finite counterterms. 

• Anomalous dimensions for the active partons and the $\beta$ function are the same as in 
the $\overline{\rm MS}$ scheme for the theory with the heavy quark omitted. They have no mass 
dependence. 

• At no stage, in either this subscheme or the subscheme where the heavy quark is 
active (or partonic), do we have to make an expansion in powers of M/Q or Q/M: the 
heavy quark mass need never be approximated. So the scheme can be applied when 
there are several heavy quarks and the ratios of their masses are not necessarily large. 
Furthermore, there is no loss of accuracy when treating problems where a heavy quark 
is not heavy enough for it to decouple to high accuracy and not light enough for its 
mass to be approximated by zero. 

In this section we will treat the case that the theory contains one heavy quark and that 
$Q \lesssim M$. The most general case, that there are several heavy quarks, whose masses may or 
may not be larger than $Q$, will form an elementary generalization to be treated in Sec. XI. 

We will first derive a factorization theorem without taking account of renormalization and 
then we will do the renormalization. 



A. Bare factorization theorem 

When we are in the region $Q \lesssim M$, the leading regions continue to be of the form of Fig. 
m. However, the specification of the graphs is a bit different, since heavy quark loops must 
each be contained in the hard part H or in renormalization subgraphs of T. Thus the lines 
joining H and T must always be light partons. To obtain a factorization theorem, we use 
the reasoning in Sec. V with two changes. 

The first change is that since heavy quarks cannot join the hard and target subgraphs, we 
change Eq. ( p!7| ) so that the amplitudes corresponding to $C_0$, $K_0$, $T_0$ and $D$ are two-particle 
irreducible in the light partons only. The second change is that we need to take account of 
the decoupling theorem for graphs with heavy quark loops. 

The first change means that Eq. (^) needs to be replaced by 

\[ F = \sum_{n=0}^{\infty} C_H \cdot (K_H)^n \cdot T_H + D_H = C_H \cdot \frac{1}{1 - K_H} \cdot T_H + D_H, \qquad (78) \]

where the subscript $H$ means that the amplitude with the subscript is 2PI only in light 
parton lines. We can formalize the definitions of $C_H$, etc. by defining a projection $P_L$ that 
is unity on light lines and zero on heavy quark lines. The projector onto heavy lines is 
$P_H = 1 - P_L$. Then 

\[ D_H = D_0 + C_0 \cdot \frac{1}{1 - P_H \cdot K_0} \cdot P_H \cdot T_0. \qquad (79) \]

It can be verified that with these definitions, the structure function given by Eq. (78) is the 
same as before, i.e., $C_0 \cdot \frac{1}{1 - K_0} \cdot T_0 + D_0$. 
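
To make the role of the projectors concrete, the following sketch writes the light-2PI amplitudes in terms of the fully 2PI ones. These explicit forms are an assumption, chosen to be consistent with Eq. (79) and with the requirement that only light partons join the factors; they are not quoted from the definitions above.

```latex
% Assumed forms: each factor resums rungs connected by heavy-quark lines (P_H).
\[
  C_H = C_0 \cdot \frac{1}{1 - P_H K_0} \cdot P_L, \qquad
  K_H = P_L \cdot K_0 \cdot \frac{1}{1 - P_H K_0} \cdot P_L, \qquad
  T_H = P_L \cdot \frac{1}{1 - K_0 P_H} \cdot T_0 .
\]
% A chain C_0 X_0 K_0 X_1 ... K_0 X_n T_0, with each X_i equal to P_L or P_H,
% either contains at least one P_L (counted exactly once in C_H K_H^m T_H)
% or contains none (counted in D_H - D_0). Summing both classes gives
\[
  C_H \cdot \frac{1}{1 - K_H} \cdot T_H + D_H
  = C_0 \cdot \frac{1}{1 - K_0} \cdot T_0 + D_0 .
\]
```

The counting works because every chain with at least one light connection has a unique first and last light connection, and the pieces before, between and after them are exactly the heavy-line resummations contained in $C_H$, $K_H$ and $T_H$.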
We define the remainder to be 

\[ r_H = C_H \cdot \frac{1}{1 - (1 - Z) K_H} \cdot (1 - Z) \cdot T_H + D_H. \qquad (80) \]

This remainder is power suppressed, just like the remainder $r$ that we defined in Eq. (p^S]). 

No changes are needed in the reasoning that led to the bare factorization theorem Eq. 
(^). We find that 

\[ F = C_{Hb} \otimes A_{Hb} + \text{non-leading power}, \qquad (81) \]
where the bare coefficient function is 

and the bare operator matrix element (or bare parton density) is 

\[ P_L \cdot A_{Hb} = P_L \cdot Z \cdot \frac{1}{1 - K_H} \cdot T_H. \qquad (83) \]

The leading regions only have active, light partons joining the hard subgraph and the 
target subgraph. This is reflected in the formulae by the fact that there are explicit factors 
of $P_L$ on the right of $C_H$, on the left of $T_H$ and on both sides of $K_H$. Hence we may insert 
the explicit factors of $P_L$ in the formulae for the coefficient function and operator matrix 
elements, Eqs. (82) and (83). 

The reader should clearly understand the distinction between the following notations: 
$C_0$ is a fully 2PI and amputated Green function for two virtual photons and two quarks; $C_H$ 
is the same Green function as $C_0$ except for being 2PI only in light parton lines; and finally 
$C_{Hb}$ is a Wilson coefficient: it is the full amputated Green function, including reducible 
graphs but with subtractions to make it a purely UV object. 

Contrary to appearances, the definition Eq. (83) is equivalent to the previous definition, 
Eq. (^), so that 

\[ P_L \cdot A_{Hb} = P_L \cdot Z \cdot \frac{1}{1 - K_0} \cdot T_0. \qquad (84) \]

The algebraic proof of this equation, starting from Eq. (83), is left as an exercise. We can 
also define the densities of heavy quarks by $P_H \cdot A_{Hb} = P_H \cdot Z \cdot \frac{1}{1 - K_0} \cdot T_0$, but we will not 
need to use the definition here, since only light parton densities appear in the factorization 
theorem. 

At first sight it appears that the bare parton densities $A_{Hb}$ are identical to those in the 
previous version of the factorization theorem. This is not quite so, because we are 
using a different renormalization subscheme for the QCD action, both subschemes being 
part of the CWZ [12] family of schemes. Green functions in the two subschemes differ by 
factors associated with the changes in the wave function renormalization factors. In addition, 
even without wave function renormalization, the numerical values of the coefficients in the 
perturbation expansion of $K_0$, etc., would differ because the numerical value of the coupling 
$\alpha_s$ differs between the two subschemes. This can all be summarized by saying that $K_0$, $T_0$ 
and $C_0$ in the two subschemes differ by a renormalization group transformation. 

When we renormalize the operators, and hence construct the renormalized factorization 
theorem, we will need to work in terms of $K_0$ rather than $K_H$. So we rewrite our new 
coefficient function $C_{Hb}$ in terms of fully 2PI amplitudes. This is done quite simply by 
defining a new projection operator $Z_H$ that is zero when applied to heavy quark lines and 
that is $Z$ on light parton lines. Then $Z_H = Z \cdot P_L$. 

Graphically, the coefficient function $C_{Hb}$ given in Eq. (82) is $C_0$ with any number of 
$K_0$s attached. If neighboring rungs are connected by active partons, then a factor of $1 - Z$ 
is inserted, but connections by heavy quarks are left unaltered. A straightforward but 
somewhat lengthy algebraic derivation shows that Eq. (82) implies that 

\[ C_{Hb} = C_0 \cdot \frac{1}{1 - (1 - Z_H) K_0} \cdot P_L. \qquad (85) \]

Observe that on an active light parton $1 - Z_H = 1 - Z$ and on a heavy quark $1 - Z_H = 1$, so 
that this equation agrees with the verbal description given at the beginning of the paragraph. 



B. Renormalized factorization theorem 

Next we copy and slightly modify the steps needed to derive the renormalized 
factorization theorem. To define the renormalized parton densities, we need to use a renormalization 
scheme in which the heavy parton is treated as non-partonic. So we define 

\[ A_{HR} = Z \cdot \sum_{n=0}^{\infty} \left[ K_0 \cdot (1 - V_H) \right]^n \cdot T_0 = Z \cdot \frac{1}{1 - K_0 \cdot (1 - V_H)} \cdot T_0. \qquad (86) \]

The renormalization is defined by $V_H$, which is an operation that acts to the left. We define 
$L V_H$ as follows: If $L$ contains heavy quark loops and its rightmost external lines are light 
partons, then $L V_H$ is the value of $L(q, k, M, m)$ when $k^-$ and $k_T$ are replaced by zero and 
the light parton masses $m$ are replaced by zero. If $L$ contains no heavy lines, then $L V_H$ is 
just the $\overline{\rm MS}$ pole part of $L$. The remaining case is when we apply $V_H$ to graphs with external 
heavy lines. There is a choice of scheme that is not determined by the overall requirements 
listed in Sec. This is similar to the non-uniqueness found by Roberts and Thorne [10,11]. 
We will choose to define the operation to be pole-part subtraction, in the $\overline{\rm MS}$ style, as we 
did in a similar situation when renormalizing the interactions. 

In accordance with the dictates of the BPH approach to renormalization, counterterms 
are kept with the graphs they subtract. Thus $L V_H$ is only used when $L$ is a quantity 
for which all subdivergences have been subtracted. This also ensures [12] that the use of 
zero-momentum subtractions for subgraphs containing heavy quark loops introduces no IR 
divergences in the counterterms. 

With these definitions, we can copy most of the previous derivation of a renormalized 
factorization theorem. First we observe that the relation between renormalized and bare 
parton densities has the form 

\[ A_{HR} = G_H \otimes A_{Hb}, \qquad (87) \]

where we use 

\[ G_H = Z - Z \cdot \frac{1}{1 - K_0 \cdot (1 - V_H)} \cdot K_0 V_H \qquad (88) \]

instead of $G$ given by Eq. (50). Then we express the factorization theorem in terms of 
renormalized parton densities 

\[ F = C_{HR} \otimes A_{HR} + \text{remainder } r_H, \qquad (89) \]

where the renormalized coefficient function is 

\[ C_{HR} = C_{Hb} \otimes G_H^{-1}. \qquad (90) \]

Finally, we bring in the decoupling theorem. This implies that a renormalized graph 
for $A_{HR}$ is suppressed by a power of $\Lambda/M$ if it contains any heavy quark lines. This is a 
consequence of the use of a renormalization scheme which obeys manifest decoupling, for 
both the interaction and operator matrix elements. We are assuming here that the target 
hadron in the structure function is a light hadron. One case of this result is that the density 
of a heavy quark is power suppressed in the scheme we are using in this section. This result 
only applies to the renormalized heavy quark density, not to the bare heavy quark density. 

We can therefore restrict the renormalized coefficient function so that its external lines 
are light, and the factorization theorem becomes 

\[ F = C_{HR} \otimes A_{LR} + \text{power-suppressed remainder}, \qquad (91) \]

where now the parton densities $A_{LR}$ are renormalized parton densities in the effective 
low-energy theory with the heavy quark omitted. There appears to be no simple formula for the 
remainder, and notice that the remainder is not equal to $r_H$ defined in Eq. (80). 

As before, there appears to be no simple formula for the coefficient function, but a 
simple recursion formula does exist and it corresponds to the algorithms actually used to do 
calculations. The formula is almost the same as the previous one, Eq. (65): 

\[ C_{HR}^{(n)} = F_p^{(n)} - \sum_{n'=0}^{n-1} C_{HR}^{(n')} \otimes A_{LRp}^{(n-n')}. \qquad (92) \]

The structure function is to be computed on a light-parton target only, not on a heavy 
target, and the light-parton masses are to be set to zero. The parton density has subscripts 
$LRp$, whose meaning is as follows: The $L$ indicates that $A_{LRp}$ is computed with the omission 
of all graphs containing heavy-quark lines. The $R$ indicates that it is renormalized, and the 
$p$ indicates the same (zero-mass light-parton) target as for the structure function. 

The one complication in proving Eq. (92) results from the fact that in deriving the 
factorization theorem, Eq. (91), on a general target, we omitted graphs for $A_{LR}$ that contain 
heavy lines, but without giving a formula for the omitted terms. So the recursion formula Eq. 
(92) could be in error by similar terms, i.e., there might be a power-suppressed remainder 
term on the right-hand side. In fact all graphs for $A_{LRp}$ that include heavy quark lines 
are exactly zero when combined with their counterterms. This is because they are being 
evaluated with their external momenta at exactly the subtraction point. Hence Eq. (92) is 
exact. 
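
As a sketch of how such a recursion is used in practice, the first few orders can be written out explicitly. This assumes that the recursion has the usual form obtained by expanding $F_p = C_{HR} \otimes A_{LRp}$ order by order in the coupling, that superscripts label the order, and that the zeroth-order density on a parton target is the unit of the convolution; the indexing is an assumption of this illustration.

```latex
% Assumption: A_{LRp}^{(0)} = 1 (the unit of \otimes), superscripts = order in alpha_s.
\begin{align*}
  C_{HR}^{(0)} &= F_p^{(0)}, \\
  C_{HR}^{(1)} &= F_p^{(1)} - C_{HR}^{(0)} \otimes A_{LRp}^{(1)}, \\
  C_{HR}^{(2)} &= F_p^{(2)} - C_{HR}^{(0)} \otimes A_{LRp}^{(2)}
                 - C_{HR}^{(1)} \otimes A_{LRp}^{(1)} .
\end{align*}
```

Each line uses only coefficient functions of lower order, so the coefficients can be built up one order at a time from the partonic structure functions and the light-parton densities.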



C. Differences between heavy and light factorization 

The renormalized factorization theorem with heavy quarks, Eq. (91), differs from the 
first factorization theorem Eq. (53) in two respects: 

• The sum over partons in the heavy quark factorization is restricted to light partons 
only. 

• The parton densities differ by a change of scheme. 

The first point accounts for our terminology of contrasting 'active' (or 'partonic') with 
'non-partonic' quarks. In the factorization we derived for $Q \gtrsim M$, Eq. (53), the heavy quark is 
partonic: there is a term involving hard scattering off a heavy quark. In contrast, in the 
factorization for $Q \lesssim M$, Eq. (91), there is no such term. 

There is an overlapping domain of utility of the two schemes. This is where both $Q$ 
and the $\overline{\rm MS}$ scale $\mu$ are of order $M$. In this situation there are no large logarithms in the 
coefficient functions and no large logarithms in the coefficients that relate the two schemes. 
This overlap is important because it implies that the relation between the parton densities 
in the two schemes can be computed perturbatively. In practical applications it should be 
remembered that at large $x$, the physical threshold for heavy-quark production can be well 
above $Q$, and consequently the region where the two schemes have common domains of 
utility should then be appropriately biased upwards in $Q$. 

When the heavy quark is treated as non-partonic, its parton density is not used in the 
factorization theorem, and one might suppose that the heavy quark density does not exist 
at all. In fact the heavy quark density does exist, because one can define it by exactly the 
usual operator formula, together with renormalization (as dictated by the CWZ scheme). 
The important fact is that the heavy quark (and antiquark) densities can be expressed in 
terms of the light parton densities by a version of factorization. This is a heavy quark 
expansion for matrix elements of heavy-quark operators in light states, and the argument 
was first given by Witten |^5[ for the case of local operators. In the subscheme where 
the heavy quark is non-partonic, the result is quite simple: the heavy quark densities are 
suppressed by a power of the heavy quark mass: 

\[ f_{H/p} = O(\Lambda/M). \qquad (93) \]

We used this property in our derivation of the factorization theorem. 



XI. MULTIPLE HEAVY QUARKS 

Let us now suppose that we have the most general case that there are several heavy 
quarks, whose masses may or may not differ greatly, and that Q can vary over a wide range. 



A. Factorization 

In this situation, we define a series of subschemes, each of which is labeled by the subset 
of the flavors of quarks and gluon which are treated as active (or partonic). The other 
flavors in the subscheme we call non-partonic. The choice of subscheme is made according 
to the value of $Q$. If $Q$ is much larger than the mass of a particular quark, then that quark 
is partonic. If $Q$ is much smaller than the mass of a particular quark, then that quark is 
non-partonic. If $Q$ is comparable to the mass of a particular quark, we may freely choose 
whether the quark is partonic or non-partonic. Gluons are light, so they are always partonic. 
We can define the scheme by saying that the $n_A$ lightest quarks are partonic. 
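
As an illustration of this bookkeeping, the following Python sketch picks the number of active quark flavors for a given $Q$. The switching rule (a quark becomes partonic once $Q$ reaches `switch_ratio` times its mass), the function name, and the mass table are assumptions made for the example; the text only requires that the switch happen somewhere in the region where $Q$ is comparable to the mass.

```python
# Illustrative heavy-quark masses in GeV (assumed values for this sketch).
HEAVY_QUARK_MASSES = {"c": 1.5, "b": 4.5}

# Number of quarks treated as light, hence always partonic (assumption: u, d, s).
N_LIGHT = 3


def n_active(q, masses=HEAVY_QUARK_MASSES, switch_ratio=1.0):
    """Number of partonic quark flavors at scale q.

    A heavy quark is counted as partonic when q >= switch_ratio * mass.
    The value of switch_ratio encodes the free choice available when q is
    comparable to the quark mass.
    """
    if q <= 0.0 or switch_ratio <= 0.0:
        raise ValueError("q and switch_ratio must be positive")
    return N_LIGHT + sum(1 for m in masses.values() if q >= switch_ratio * m)
```

Changing `switch_ratio` moves the switching points without changing the physics, provided the matching between subschemes is done at a scale comparable to the quark mass.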

Factorization is derived by a minor extension of the procedure in Sec. X. In that section 
we had one heavy quark, which was treated as non-partonic, with the gluon and other quarks 
being treated as partonic. We simply need to replace all references to a 'heavy quark' by 
references to 'non-partonic quarks'. Thus renormalization counterterms are generated by 
$\overline{\rm MS}$ pole terms, except for mass renormalization of heavy quarks, which is always performed 
on-shell, and except for graphs with loops of non-partonic quarks, whose counterterms are 
computed at zero external momentum and with the masses of the active partons set to zero. 
This defines the appropriate version of the renormalization operator that is to replace $V$ in 
Eq. (1^ or $V_H$ in Eq. (H). 

In the construction of the coefficient function and the remainder for the factorization 
formula, the operation $Z_H$ must be replaced by $Z_{n_A}$, which is $Z$ when applied on an active 
quark or gluon and zero on non-partonic quarks. 

The methods used to construct the two factorization proofs readily generalize to show 
that the remainder is suppressed by a power of $Q$, provided that all the active partons 
have masses less than or of the order of $Q$. Moreover, in the perturbative expansion of the 
coefficient function, if the $\overline{\rm MS}$ scale $\mu$ is of order $Q$, there will be no large logarithms of 
ratios of $Q$, $\mu$ and quark masses provided also that the masses of the non-partonic quarks 
are all larger than or comparable with $Q$. The coefficient functions have infra-red-safe limits when 
masses of active partons are set to zero. (This applies in particular to the light quarks; their 
masses may always be set to zero in the coefficient functions.) 

B. When can the masses of active partons be set to zero? 

The setting to zero of active parton masses in the renormalization prescription is neces- 
sary to get the simplest results, for example for the renormalization-group coefficients. It is 
always legitimate. 

Moreover, if one is computing the coefficient function for a particular external quark, then 
one can set to zero the mass of this quark and of the lighter partons, as explained around Eq. 
(77). It is only with this prescription that the recursion formula for the coefficient function, 
Eq. (92), is exact. 

As an example, suppose that one is treating the charm quark (of mass $m_c = 1.5$ GeV) 
as partonic but the bottom quark (of mass $m_b = 4.5$ GeV) as non-partonic. This implies 
that we are treating phenomena on a scale of at least $m_c$. Furthermore, suppose that one 
has decided that the charm quark is not sufficiently light compared to $m_b$ for its mass to 
be neglected. Then in coefficient functions with external gluons, for example, one leaves 
both the masses of the charm and bottom quarks at their physical values. In contrast, in a 
coefficient function with an external charm quark, its mass may be set to zero. As explained 
around Eq. (77), this may be done without loss of accuracy, since any errors are taken care 
of by higher-order coefficients with lighter external partons. 

XII. MATCHING CONDITIONS AND EVOLUTION EQUATIONS 

A. Matching conditions 

As a consequence of the decoupling theorem, the density of a non-partonic quark is 
suppressed by a power of $\Lambda/M$, where $M$ is the mass of the quark, so we will normally 
approximate these densities by zero. 

Furthermore there are matching conditions between the parton densities with $n_A$ and 
$n_A + 1$ active quarks. The coefficients relating the parton densities are functions of the quark 
masses and $\mu$, and have no large logarithms provided that $\mu$ is of the order of the mass of 
quark $n_A + 1$. The coefficients also have infra-red-safe limits when the masses of the $n_A$ 
lightest quarks are set to zero. 

These matching conditions have been given in Ref. to order $\alpha_s$ and in Ref. to 
order $\alpha_s^2$. They are applied to calculate the parton densities with $n_A + 1$ active quarks from 
the parton densities with $n_A$ active quarks. The conditions are to be applied at a value of 
the renormalization scale around the mass of quark $n_A + 1$. Given that we set the density of 
the quark $n_A + 1$ to zero when it is non-partonic, the matching conditions give initial values 
for all $n_A + 1$ quarks and the gluon, which can therefore be evolved upward in scale. This 
gives an effective calculation of the density of quark $n_A + 1$ in the region where it is active. 
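
The bookkeeping of this step can be sketched in Python. The sketch keeps only the zeroth-order term of the matching coefficients, which is an assumption made for brevity: the light-parton densities are carried over unchanged and the newly active quark and antiquark start from zero. The order-$\alpha_s$ and higher corrections discussed above are omitted, and the function name and the representation of densities as lists on an $x$ grid are hypothetical.

```python
def match_up(densities, new_quark):
    """Zeroth-order matching from n_A to n_A + 1 active quarks.

    densities: dict mapping a parton label to a list of values on an x grid,
               defined in the subscheme with n_A active quarks.
    new_quark: label of the quark that becomes active.

    Returns a new dict for the subscheme with n_A + 1 active quarks. Higher
    order terms in the matching coefficients are NOT included here.
    """
    if not densities:
        raise ValueError("need at least one light-parton density")
    grid_size = len(next(iter(densities.values())))
    # Light-parton densities are continuous across the switch at this order.
    matched = {name: list(values) for name, values in densities.items()}
    # The newly active quark and antiquark start from zero density.
    matched[new_quark] = [0.0] * grid_size
    matched[new_quark + "bar"] = [0.0] * grid_size
    return matched
```

The output of such a step would serve as the initial condition for upward evolution in the subscheme with one more active flavor.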



B. Evolution equations 

We have a series of schemes labeled by the number of active quarks, $n_A = 3, 4, 5, \ldots$. In 
each scheme we have densities for the gluon and for each of the active quarks and antiquarks. 
Up to power-suppressed corrections, the densities of the non-partonic quarks and antiquarks 
are zero. The active partons evolve according to the standard DGLAP evolution equations, 
with the kernels being those of the $\overline{\rm MS}$ scheme with $n_A$ flavors. 
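
For orientation, the evolution in the subscheme with a given number of active quarks can be written as follows in one common normalization. The symbols $f_i$ and $P_{ij}$ and the choice of $\ln\mu^2$ as the evolution variable are notational assumptions of this sketch, not taken from the text.

```latex
\[
  \frac{d f_i(x,\mu)}{d\ln\mu^2}
  = \sum_{j} \int_x^1 \frac{d\xi}{\xi}\,
    P_{ij}\!\left(\frac{x}{\xi}, \alpha_s(\mu); n_A\right) f_j(\xi,\mu) .
\]
% i and j run over the gluon and the n_A active quarks and antiquarks only;
% the densities of non-partonic quarks are set to zero in this subscheme.
```

The dependence on the number of active flavors enters both through the kernels and through the running of the coupling, so both change when one switches subscheme.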



XIII. MISCELLANEOUS COMMENTS 
A. Relation to other methods of treating heavy quarks 

Calculations of heavy quark production often use what is called a fixed-flavor-number 
scheme l]!],^,^,^. This corresponds exactly to the method described in the present paper 
if the heavy quark is treated as non-partonic. (For example, it corresponds to a 3-flavor 
scheme for charm production and to a 4-flavor scheme for bottom production.) 

Other calculations switch between different numbers of active quarks, but neglect the 
masses of the active quarks in the coefficient functions. This is a valid approximation to 
the scheme here when power corrections in M/Q are negligible, but not when these power 
corrections are important. The scheme described in this paper does not require the masses 
of active quarks to be neglected. 

I have been unable to discover the justification of the scheme proposed by Martin, 
Roberts, Ryskin and Stirling 0. 

Roberts and Thorne [10,11] appear to have a scheme similar to the one in the present 
paper. But they do not present complete proofs, and they make a number of incorrect 
or misleading statements. For example, they state that "the detailed construction of the 
coefficient functions ... is extremely difficult if not impossible." As regards the general 
formalism, the construction is exactly as difficult as in the light-quark case. The only 
computational complication is that in a calculation of the coefficient functions, heavy quark 
masses must be retained. All the necessary Feynman-graph calculations for computing the 
coefficient functions at order $\alpha_s^2$ have been done in Refs. and all that remains is to 
organize them to form the coefficient function by use of the recursion relation Eq. (65). This 
recursion relation is of the same form as the one used to obtain the coefficient functions in 
the massless case. 

B. Modification of the schemes 

It is possible to redefine the factorization results by a change of scheme that defines the 
parton densities. This is in effect a change of the renormalization operation that defines 
them. 

In addition, the details of the extraction of the asymptotics of the structure functions 
may be changed by redefining the $Z$ operation. The constraints on allowed redefinitions were 
explained in Sec. IX D, and they are implied by the requirements for a good factorization 
scheme that were listed in Sec. |I[ The $Z$ operations and the renormalization operation 
should not change the validity of the error estimates used in the proof of factorization. 

I consider the $\overline{\rm MS}$ scheme to be the best underlying scheme at the present state of the 
art, since it is the scheme most commonly used for calculations of QCD corrections to hard 
processes (at least when masses are ignored). 



C. Comparison with Zimmermann's approach 



One often gets the impression that Zimmermann's derivation HS^j of the operator product 
expansion (OPE) is considered as the most reliable. However, Zimmermann does not in fact 
prove the results that we need for regular QCD phenomenology, even if we restrict to the 
case that the OPE is sufficient. (The derivation in the present paper in fact applies to
the Minkowski-space structure functions, rather than only to the integer moments of
the structure functions. It is to these integer moments that the OPE in its strict sense is
restricted.)

His results suffer from two disadvantages. The first is that his Wilson coefficients have
divergences in the zero-mass limit. They are not infra-red safe, and further work is needed to
put the results in a useful form for perturbative phenomenology in QCD. The second
disadvantage is that his evolution equations are the inhomogeneous Callan-Symanzik equations
rather than the homogeneous renormalization-group equations that can actually be used
in practice. The inhomogeneous term is not of a form susceptible to easy calculation, so
further work is needed to show that to a suitable approximation, this term can be neglected.
In Tkachov's terminology [1^, Zimmermann's version of the OPE does not give a 'perfect 
asymptotic expansion' at large Q. In contrast, the factorization proved in the present paper 
is perfect in this sense. 

In this section, we will see how Zimmermann's results can be proved by our methods,
and that they indeed suffer from the above-mentioned disadvantages.

The algebraic steps that led to our factorization theorem are shown in Eq. (^2]). The 
strategy in organizing the manipulations was that the right-most factor of Z should be made 
explicit. Zimmermann's result can be obtained by arranging so that the left-most Z is picked 
out. This results in the following derivation: 



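\begin{align*}
F - r &= C_0 \cdot \left[ \frac{1}{1-K_0} - (1-Z) \cdot \frac{1}{1-K_0(1-Z)} \right] \cdot T_0 \\
      &= C_0 \cdot \frac{1}{1-K_0} \cdot \left[ 1 - K_0(1-Z) - (1-K_0)(1-Z) \right] \cdot \frac{1}{1-K_0(1-Z)} \cdot T_0
\end{align*}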
We therefore have a factorization theorem

\begin{equation}
F = C_Z \otimes A_Z + \text{non-leading power}, \tag{95}
\end{equation}
where the coefficient function is
\begin{equation}
C_Z = C_0 \cdot \frac{1}{1-K_0} \cdot Z, \tag{96}
\end{equation}
and the operator matrix element is
\begin{equation}
A_Z = Z \cdot \frac{1}{1-K_0(1-Z)} \cdot T_0. \tag{97}
\end{equation}
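
As a check on the last step of the derivation, the bracket in its second line can be expanded explicitly. The sketch below uses only ordinary linear algebra and the projection property $Z^2 = Z$ quoted in Appendix A; no other property of $Z$ or $K_0$ is assumed.

```latex
\begin{align*}
 1 - K_0(1-Z) - (1-K_0)(1-Z)
   &= 1 - K_0 + K_0 Z - 1 + Z + K_0 - K_0 Z \\
   &= Z , \\
 F - r
   &= C_0 \cdot \frac{1}{1-K_0} \cdot Z \cdot \frac{1}{1-K_0(1-Z)} \cdot T_0
    = C_Z \otimes A_Z .
\end{align*}
```

The single factor of $Z$ is shared between the two factors, which is why it appears both at the right of $C_Z$ and at the left of $A_Z$.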

Notice first that the operator matrix element Az in Zimmermann's approach is already 
ultra-violet finite: the 1 — Z factors in Eq. (pT]) provide the necessary counterterms. This 
is in contrast with our approach in Sect. V, where some extra work was needed to express
the factorization in terms of renormalized operators. Unfortunately, the counterterms in
Zimmermann's approach are calculated at zero momentum, and so they suffer from
divergences in the massless limit, notably for the gluons. Thus although the bare matrix
elements (without renormalization) are infra-red finite, if the hadron state is well-behaved, 
the renormalization procedure introduces mass divergences. 
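
The way a zero-momentum subtraction introduces a mass divergence can be seen in a one-loop toy example. The sketch below is not taken from the derivation in this paper: it assumes a Euclidean scalar bubble with both internal lines of mass $m$, and the symbols $\Pi$ and $\mu$, as well as the overall normalization, are illustrative.

```latex
% Minimal-subtraction-type renormalization (normalization and constants dropped):
\begin{equation*}
 \Pi_{\overline{\rm MS}}(Q^2) = \int_0^1 dx \, \ln\frac{m^2 + x(1-x)Q^2}{\mu^2}
 \;\longrightarrow\; \ln\frac{Q^2}{\mu^2} - 2 \qquad (m \to 0) .
\end{equation*}
% Zero-momentum subtraction of the same amplitude:
\begin{equation*}
 \Pi(Q^2) - \Pi(0) = \int_0^1 dx \, \ln\left[ 1 + \frac{x(1-x)Q^2}{m^2} \right]
 \;\simeq\; \ln\frac{Q^2}{m^2} - 2 \qquad (m^2 \ll Q^2) .
\end{equation*}
```

The first form has a finite massless limit. In the second, the counterterm $\Pi(0) = \ln(m^2/\mu^2)$ itself diverges as $m \to 0$, so the subtracted quantity inherits a logarithm of the mass.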

Moreover the coefficient function is ultra-violet finite, since it is just a Green function of 
two currents and two partons. In Zimmermann's work on the OPE, the external partons
of the coefficient function are given zero momentum; this corresponds to his use of zero 
momentum subtractions to do renormalization. The correct generalization to Minkowski 
space problems is given by the operator Z defined in Eq. (^: only the — and transverse 
components of a momentum are set to zero. Our derivation works equally well with on-shell 
renormalization, with Z defined by Eq. (pi]). 

However, Zimmermann's definition of the coefficient is not infra-red finite. One cannot 
set the masses to zero. This is the strongest reason for not regarding Zimmermann's approach 
as adequate for the problems we are interested in. It is a particular problem in QCD as 
opposed to other field theories, since the gluon is intrinsically massless. 



D. Other processes 

Exactly the same methods that have been explained here can be applied to other pro- 
cesses. Also, if there turn out to be other fields with color interactions, for example, squarks 
or gluinos, they can be treated by minor generalizations of the same methods: we have the 
choice of treating each massive field as either partonic or non-partonic. 



XIV. CONCLUSIONS 

I have given a proof of factorization for deep-inelastic structure functions including the 
effects of heavy quarks. The methods are general and include all non-leading logarithms. 
The scheme implemented is exactly that of ACOT 0. The proof is applicable independently
of the relative sizes of the heavy quark masses and Q, and the size of the errors is a power 
of $\Lambda/Q$. It can be readily extended to other hard processes.

Although this paper is quite lengthy, its core is really quite short. The essential elements 
of the proof are: 

1. Power counting is used to prove that the leading regions have the form symbolized by 
Fig. 1. This is a standard basic result of perturbative QCD.



2. The remainder, as defined in Eq. ([T8|), is then proved to be a non-leading power. The
proof is fairly obvious given the form of the leading regions.

3. The bare form of factorization then follows from the three lines of algebra given in Eq. m.

4. Renormalization of the parton densities is implemented in Eq. (71). Then applying
the inverse renormalization factor gives the renormalized factorization theorem. 

5. Application of the factorization theorem to a parton target gives an algorithm for 
computing the coefficient function. 

This gives the factorization theorem when a heavy quark is treated as partonic. Simple 
modifications, plus the use of the decoupling theorem, give the corresponding results when 
a heavy quark is non-partonic. 
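
For orientation, the algebra behind the remainder and the bare factorization can be sketched in the ladder notation used in the comparison with Zimmermann's approach and in Appendix A. The forms below are assumptions reconstructed from those passages (in particular $F = C_0 \cdot \frac{1}{1-K_0} \cdot T_0$, with the right-most $Z$ made explicit); they are a reminder of the structure, not a substitute for the equations cited in the list.

```latex
\begin{align*}
 r &= C_0 \cdot \frac{1}{1-(1-Z)K_0} \cdot (1-Z) \cdot T_0 , \\
 F - r
   &= C_0 \cdot \frac{1}{1-(1-Z)K_0} \cdot
      \bigl[ 1 - (1-Z)K_0 - (1-Z)(1-K_0) \bigr] \cdot \frac{1}{1-K_0} \cdot T_0 \\
   &= C_0 \cdot \frac{1}{1-(1-Z)K_0} \cdot Z \cdot \frac{1}{1-K_0} \cdot T_0 \\
   &= C_B \otimes A_B .
\end{align*}
```

Here the bracket reduces to $1 - (1-Z) = Z$, and the two bare factors are taken to be $C_B = C_0 \cdot \frac{1}{1-(1-Z)K_0} \cdot Z$ and $A_B = Z \cdot \frac{1}{1-K_0} \cdot T_0$.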

When one is treating a heavy quark as partonic, it is valid to include the heavy quark 
in the sum over partons in the factorization formula even though it cannot really be treated 
as a parton, in Feynman's sense.* Errors in doing this are automatically taken care of
by the inclusion of higher-order terms in the coefficient functions. Since the heavy quark 
densities and the light parton densities are of different sizes in the threshold region, a correct 
leading-order calculation can only be done if lowest-order coefficient functions are included 
for all possible subprocesses. The lowest-order coefficient functions are of different orders 
in $\alpha_s$: the quark-induced processes have a lowest order of 1, and the gluon-induced process
has a lowest order of $\alpha_s$. As $Q$ changes, the relative contributions of the different subprocesses
change in size. This mixing of orders is to be expected in any problem where the parton 
densities have very different sizes, and is not incorrect, contrary to the assertion of Roberts 
and Thorne [ITT 
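
The counting behind this statement can be made explicit. In the sketch below, the superscripts give the power of $\alpha_s$ at which each coefficient function starts, $f_Q$ and $f_g$ are the heavy-quark and gluon densities, and the estimate for $f_Q$ is the standard lowest-order evolution result starting from a vanishing heavy-quark density at $\mu \simeq M$. These labels and the estimate are assumptions added for illustration, not formulas from the text.

```latex
\begin{align*}
 F &\simeq C_Q^{(0)} \otimes f_Q + \alpha_s \, C_g^{(1)} \otimes f_g + \dots , \\
 f_Q(x,\mu) &\simeq \frac{\alpha_s}{2\pi} \ln\frac{\mu^2}{M^2} \, P_{qg} \otimes f_g
   \qquad (\mu \text{ not far above } M) .
\end{align*}
```

Near threshold both terms in the first line are therefore of order $\alpha_s$ times the gluon density, so neither can be dropped in a consistent leading-order calculation.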



Notice that there is an implicit unitarity sum over final states in the whole of our work. As 
explained on page ^, this implies that the details of the final-state interactions do not affect 
factorization or the calculation of the coefficient functions. In particular, it is irrelevant that 
in Feynman-graph calculations, there are on-shell partons in the final state, even though in
the real world there are only physical hadrons in the final state.



* Footnote to the sentence ending "as a parton, in Feynman's sense": the word "parton" is used in two different senses in this sentence!
APPENDIX A: MISLEADING DERIVATION OF FORMULA FOR
RENORMALIZED COEFFICIENT FUNCTION 

In this appendix we show how some apparently correct manipulations can be used to justify
a plausible but wrong formula for the renormalized coefficient function. The formula is

\begin{equation}
C_{\rm cand} = C_0 \cdot \frac{1}{1-(1-Z)K_0} \cdot (1-V) \cdot Z. \tag{A1}
\end{equation}

(The subscript 'cand' is to indicate this is a candidate for the renormalized coefficient
function.) Expanded in powers of $K_0$ this gives

\begin{equation}
C_{\rm cand} = C_0 \cdot \sum_{n=0}^{\infty} \left[ (1-Z) K_0 \right]^n \cdot (1-V) \cdot Z. \tag{A2}
\end{equation}

This candidate coefficient function has some properties that make it an obvious candidate 
for a renormalized coefficient function: 

• The factors of $1 - Z$ prevent there from being leading contributions from regions where
the momenta on the left are much higher in virtuality than those on the right.

• This includes the case that the left-hand momenta are hard momenta, of virtuality of
order $Q^2$, as in the leading regions of Fig. 1, as well as the momenta that give ultra-violet
divergences.

• Thus the only leading regions are where all the momenta in $C_{\rm cand}$ are of virtuality of
order $Q^2$, or where there is an ultra-violet divergence where all the momenta in some
right-hand part of $C_{\rm cand}$ go to infinity.

• The factor $1 - V$ cancels all the ultra-violet divergences.

• The right-most factor of $Z$ defines the standard approximation appropriate to defining
a hard-scattering coefficient that is coupled to a collinear target factor.

Therefore $C_{\rm cand}$ represents an obvious way of applying ultra-violet renormalization to the
bare coefficient function defined in Eq. (23).

Let us now attempt to prove the factorization formula 

\begin{equation}
F = C_{\rm cand} \otimes A_R + \text{non-leading power}. \tag{A3}
\end{equation}

The following manipulations use just ordinary linear algebra, together with the definitions
of $C_B$, $A_B$, and $A_R$, and the properties $Z^2 = Z$ and $VZ = V$:
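\begin{align}
C_{\rm cand} \otimes A_R
 &= C_0 \frac{1}{1-(1-Z)K_0} (1-V) Z \, G \otimes A_B \nonumber\\
 &= C_0 \frac{1}{1-(1-Z)K_0} (Z-V) \left[ Z - Z K_0 \frac{1}{1-(1-V)K_0} V \right] A_B \nonumber\\
 &= C_0 \frac{1}{1-(1-Z)K_0} \left[ Z - V - (Z K_0 - V K_0) \frac{1}{1-(1-V)K_0} V \right] A_B \nonumber\\
 &= C_0 \frac{1}{1-(1-Z)K_0} \left[ Z - V + (1 - K_0 + V K_0 - 1 + K_0 - Z K_0) \frac{1}{1-(1-V)K_0} V \right] A_B \nonumber\\
 &= C_0 \frac{1}{1-(1-Z)K_0} \left[ Z - \bigl( 1 - (1-Z)K_0 \bigr) \frac{1}{1-(1-V)K_0} V \right] A_B \nonumber\\
 &= C_B \otimes A_B - C_0 \frac{1}{1-(1-V)K_0} V A_B \nonumber\\
 &= C_B \otimes A_B - C_0 \frac{1}{1-K_0} (1-K_0) \frac{1}{1-(1-V)K_0} V A_B \nonumber\\
 &= C_B \otimes A_B - C_0 \frac{1}{1-K_0} \bigl[ 1 - (1-V)K_0 - V K_0 \bigr] \frac{1}{1-(1-V)K_0} V A_B \nonumber\\
 &= C_B \otimes A_B - C_0 \frac{1}{1-K_0} V \left[ 1 - K_0 \frac{1}{1-(1-V)K_0} V \right] A_B . \tag{A4}
\end{align}
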



In the second term on the extreme right-hand side, we have a pole-part operation applied
to a quantity without ultra-violet divergences, $C_0/(1 - K_0)$. This second term is therefore
zero, and we appear to have proved $C_{\rm cand} \otimes A_R = C_B \otimes A_B$, which is sufficient to prove
factorization, since $C_B \otimes A_B$ equals the structure function $F$, up to a power-suppressed
remainder.

Unfortunately, the above derivation is false. It has assumed that the operation of taking
the pole part obeys all the rules of linear algebra, including associativity. The problem can
be seen at the first order in $K_0$. There are two terms on the left-hand side of Eq. (A4):

\begin{equation}
C_0 (1-Z) K_0 (1-V) Z \, [Z T_0] + [C_0 Z] \, Z K_0 (1-V) T_0 . \tag{A5}
\end{equation}



The square brackets are used to delimit factors belonging to the coefficient and to the 
operator. The terms with a pole-part are 



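\begin{align}
 & - C_0 (1-Z) K_0 V \, [Z T_0] - [C_0 Z] \, Z K_0 V T_0 \nonumber\\
 & \qquad = C_0 Z K_0 V \, [Z T_0] - [C_0 Z] \, Z K_0 V T_0 , \tag{A6}
\end{align}
where we have observed (correctly) that $C_0 K_0$ has no ultra-violet divergence.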

The two terms in Eq. (A6) appear to cancel. In fact this is not so. We are taking

\begin{equation}
[\text{pole part}(C_0 Z K_0)] \, T_0 - C_0 Z \, [\text{pole part}(Z K_0)] \, T_0 . \tag{A7}
\end{equation}



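This is not, in general, zero, as can be seen by taking a simple mathematical example. Let
us replace $C_0 Z$ and $Z K_0$ by

\[
C_0 Z = 1 + a\epsilon, \qquad Z K_0 = \frac{1}{\epsilon} + b .
\]

Then (A7) becomes

\[
\text{pole part}\left[ (1 + a\epsilon) \left( \frac{1}{\epsilon} + b \right) \right]
 - (1 + a\epsilon) \, \text{pole part}\left( \frac{1}{\epsilon} + b \right)
 = \frac{1}{\epsilon} - \left( \frac{1}{\epsilon} + a \right) = -a ,
\]

which is clearly non-zero.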

Treating V as an associative operator has failed. 
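
The same toy example can be checked mechanically. The following is a minimal sketch in Python, assuming that "pole part" means keeping only the negative powers of $\epsilon$; the dictionary representation of Laurent polynomials, the helper names `mul`, `sub` and `pole_part`, and the numerical values chosen for `a` and `b` are all illustrative and not part of the paper.

```python
from fractions import Fraction

# A Laurent polynomial in eps is stored as {power: coefficient}.

def mul(p, q):
    """Product of two Laurent polynomials."""
    out = {}
    for i, ci in p.items():
        for j, cj in q.items():
            out[i + j] = out.get(i + j, 0) + ci * cj
    return {k: v for k, v in out.items() if v != 0}

def sub(p, q):
    """Difference p - q of two Laurent polynomials."""
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) - v
    return {k: v for k, v in out.items() if v != 0}

def pole_part(p):
    """Keep only the negative powers of eps."""
    return {k: v for k, v in p.items() if k < 0}

a, b = Fraction(3), Fraction(5)   # arbitrary illustrative values
c0z = {0: Fraction(1), 1: a}      # stands in for C_0 Z = 1 + a*eps
zk0 = {-1: Fraction(1), 0: b}     # stands in for Z K_0 = 1/eps + b

first = pole_part(mul(c0z, zk0))   # pole part of the whole product
second = mul(c0z, pole_part(zk0))  # factor times the pole part of Z K_0
difference = sub(first, second)

print(difference)  # {0: Fraction(-3, 1)}, i.e. -a
```

The two groupings differ by the finite term $-a$, independently of $b$, which is the failure of associativity described above.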